

Supporting Streaming Updates in an Active Data Warehouse

Active Data Warehousing has emerged as an alternative to conventional warehousing practices to meet the high demand of applications for up-to-date information. In a nutshell, an active warehouse is refreshed on-line and thus achieves a higher consistency between the stored information and the latest data updates. The need for on-line warehouse refreshment introduces several challenges in the implementation of data warehouse transformations, with respect to their execution time and their overhead on the warehouse processes. In this paper, we focus on a frequently encountered operation in this context, namely the join of a fast stream S of source updates with a disk-based relation R, under the constraint of limited memory. This operation lies at the core of several common transformations, such as surrogate key assignment, duplicate detection, or identification of newly inserted tuples. We propose a specialized join algorithm, termed mesh join (MESHJOIN), that compensates for t...
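To make the operation concrete, the following is a minimal sketch of a straightforward batched baseline for joining a stream S with a paged relation R under a memory bound. It is not MESHJOIN, whose details the truncated abstract does not give. The function name `batched_stream_join`, the `(key, payload)` row layout, and the `batch_size` parameter are all assumptions made for illustration; R is modeled as a list of in-memory pages standing in for disk pages.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

# Hypothetical row layout: (join key, payload).
Row = Tuple[str, str]


def batched_stream_join(
    stream: Iterable[Row],
    relation_pages: List[List[Row]],
    batch_size: int,
) -> Iterator[Tuple[str, str, str]]:
    """Join stream tuples with a paged relation, holding at most
    `batch_size` stream tuples in memory at any time."""

    def flush(batch: List[Row]) -> Iterator[Tuple[str, str, str]]:
        # Build an in-memory hash index over the buffered stream tuples.
        index: Dict[str, List[str]] = defaultdict(list)
        for key, payload in batch:
            index[key].append(payload)
        # One sequential pass over R per batch, one page at a time.
        for page in relation_pages:
            for key, r_payload in page:
                for s_payload in index.get(key, ()):
                    yield key, s_payload, r_payload

    batch: List[Row] = []
    for row in stream:
        batch.append(row)
        if len(batch) == batch_size:
            yield from flush(batch)
            batch = []
    if batch:
        # Join whatever remains when the stream ends.
        yield from flush(batch)


# Surrogate key assignment as an example: R maps source keys to surrogate keys.
updates = [("a", "order1"), ("b", "order2"), ("a", "order3"), ("z", "order4")]
key_pages = [[("a", "101")], [("b", "102"), ("c", "103")]]
result = list(batched_stream_join(updates, key_pages, batch_size=2))
```

In this baseline every batch pays for a full scan of R, so the cost per stream tuple falls as the batch grows while the memory footprint and the delay before a tuple is joined both rise. That trade-off between limited memory and a fast stream is the setting the abstract describes.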
Added 01 Nov 2009
Updated 01 Nov 2009
Type Conference
Year 2007
Where ICDE
Authors Neoklis Polyzotis, Spiros Skiadopoulos, Panos Vassiliadis, Alkis Simitsis, Nils-Erik Frantzell