
ICDE
2011
IEEE

Distributed cube materialization on holistic measures

Cube computation over massive datasets is critical for many important real-world analyses. Unlike commonly studied algebraic measures such as SUM, which are amenable to parallel computation, efficient cube computation of holistic measures such as TOP-K is non-trivial and often impossible with current methods. In this paper we detail real-world challenges in cube materialization tasks on Web-scale datasets. Specifically, we identify an important subset of holistic measures and introduce MR-Cube, a MapReduce-based framework for efficient cube computation on these measures. We provide extensive experimental analyses over both real and synthetic data. We demonstrate that, unlike existing techniques, which cannot scale to the 100-million-tuple mark for our datasets, MR-Cube successfully and efficiently computes cubes with holistic measures over billion-tuple datasets.
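The contrast between algebraic and holistic measures can be illustrated with a small sketch. This is not MR-Cube or any code from the paper. It is a toy example under stated assumptions: TOP-K is read here as "the k most frequent values", and the partitions, data values, and the helper `top_k` are all hypothetical. It shows that partial SUMs from separate partitions combine into the exact global SUM, while naively merging per-partition top-k lists can return the wrong answer.

```python
from collections import Counter


def top_k(counts, k):
    """Return the k most frequent (item, count) pairs, ties broken alphabetically."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical data, split across two partitions (for example, two mappers).
numeric_partitions = [[3, 1, 4], [1, 5, 9]]
item_partitions = [
    ["a", "a", "a", "b", "b"],
    ["c", "c", "c", "b", "b"],
]

# Algebraic measure (SUM): each partition emits one number,
# and adding those numbers reproduces the global result exactly.
partial_sums = [sum(p) for p in numeric_partitions]
combined_sum = sum(partial_sums)
global_sum = sum(v for p in numeric_partitions for v in p)

# Holistic measure (top-k most frequent, an assumed reading of TOP-K):
# each partition emits only its local top-k, and the lists are merged.
k = 1
merged = Counter()
for p in item_partitions:
    for item, count in top_k(Counter(p), k):
        merged[item] += count
naive_top = top_k(merged, k)

# Exact answer, computed from the full value distribution.
exact_top = top_k(Counter(x for p in item_partitions for x in p), k)

print(combined_sum == global_sum)  # SUM survives partitioning
print(naive_top, exact_top)        # the naive merge misses item "b"
```

In this toy data, item "b" is never the most frequent value within a single partition, so its counts are discarded before the merge even though it is the most frequent value overall. An exact answer needs the full per-group distribution rather than a constant-size partial result, which is why holistic measures are hard to parallelize.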
Added 21 Aug 2011
Updated 21 Aug 2011
Type Conference
Year 2011
Where ICDE
Authors Arnab Nandi, Cong Yu, Philip Bohannon, Raghu Ramakrishnan