A warehouse is a data repository containing integrated information for efficient querying and analysis. Maintaining the consistency of warehouse data is challenging, especially if the data sources are autonomous and views of the data at the warehouse span multiple sources. Transactions containing multiple updates at one or more sources, e.g., batch updates, complicate the consistency problem. In this paper we identify and discuss three fundamental transaction processing scenarios for data warehousing. We define four levels of consistency for warehouse data and present a new family of algorithms, the Strobe family, that maintain consistency as the warehouse is updated, under the various warehousing scenarios. All of the algorithms are incremental and can handle a continuous and overlapping stream of updates from the sources. Our implementation shows that the algorithms are practical and realistic choices for a wide variety of update scenarios.
Yue Zhuge, Hector Garcia-Molina, Janet L. Wiener