Data Cube Materialization and Mining over MapReduce

Computing interesting measures for data cubes and subsequent mining of interesting cube groups over massive datasets are critical for many important analyses done in the real world. Previous studies have focused on algebraic measures such as SUM that are amenable to parallel computation and can easily benefit from the recent advancement of parallel computing infrastructure such as MapReduce. Dealing with holistic measures such as TOP-K, however, is non-trivial. In this paper we detail real-world challenges in cube materialization and mining tasks on Web-scale datasets. Specifically, we identify an important subset of holistic measures and introduce MR-Cube, a MapReduce based framework for efficient cube computation and identification of interesting cube groups on holistic measures. We provide extensive experimental analyses over both real and synthetic data. We demonstrate that, unlike existing techniques which cannot scale to the 100 million tuple mark for our datasets, MR-Cube...
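The contrast the abstract draws between algebraic and holistic measures can be illustrated with a small sketch. This is not MR-Cube and is not taken from the paper: the function names (cube_groups, partial_sums, merge_sums, top_k) and the toy data are hypothetical, chosen only to show why SUM parallelizes easily while a top-k measure does not. It assumes two dimensions per tuple and uses "*" to mark a rolled-up dimension.

```python
from collections import Counter
from itertools import product


def cube_groups(dims):
    """Return every cube group a tuple belongs to; '*' marks a rolled-up dimension."""
    choices = [(value, "*") for value in dims]
    return list(product(*choices))


def partial_sums(partition):
    """Map side (hypothetical): SUM of the measure per cube group within one partition."""
    sums = Counter()
    for dims, measure in partition:
        for group in cube_groups(dims):
            sums[group] += measure
    return sums


def merge_sums(partials):
    """Reduce side (hypothetical): SUM is algebraic, so partial sums simply add up."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts rather than replacing them
    return total


def top_k(items, k):
    """Most frequent k items; a stand-in for a holistic TOP-K style measure."""
    return [item for item, _ in Counter(items).most_common(k)]


# Toy data: ((country, device), clicks), split across two partitions.
partition_a = [(("US", "mobile"), 3), (("US", "desktop"), 2)]
partition_b = [(("US", "mobile"), 4), (("CA", "mobile"), 1)]

merged = merge_sums([partial_sums(partition_a), partial_sums(partition_b)])
direct = partial_sums(partition_a + partition_b)
# Merging per-partition sums gives the same cube as a single pass over all data.

# Top-1 item within one cube group, split across two partitions.
part1 = ["a", "a", "a", "b", "b"]
part2 = ["c", "c", "c", "b", "b"]
local_winners = set(top_k(part1, 1)) | set(top_k(part2, 1))
true_winner = top_k(part1 + part2, 1)
# "b" wins globally (4 occurrences) yet is the local top-1 in neither partition,
# so combining only the per-partition winners cannot recover the right answer.
```

In this sketch each tuple contributes to 2^n cube groups for n dimensions, which hints at why materializing a full cube over very large datasets is expensive even for a simple measure such as SUM.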

Arnab Nandi, Cong Yu, Philip Bohannon, Raghu Ramakrishnan

Formal Methods | Massive Datasets | Scale Datasets | TKDE 2012 | World Challenges |

» Densest Subgraph in Streaming and MapReduce

» Materialized View Selection for MultiCube Data Models

» BitCube A BottomUp Cubing Engineering

» ARCube supporting ranking aggregate queries in partially materialized data cubes

» Improved Data Partitioning for Building Large ROLAP Data Cubes in Parallel

» NetCube A Scalable Tool for Fast Data Mining and Compression

» Incremental Data Mining Using Concurrent Online Refresh of Materialized Data Mining Views

» Graph OLAP Towards Online Analytical Processing on Graphs

Post Info

Added	29 Sep 2012
Updated	29 Sep 2012
Type	Journal
Year	2012
Where	TKDE
Authors	Arnab Nandi, Cong Yu, Philip Bohannon, Raghu Ramakrishnan
