Parallelizing the Data Cube

This paper presents a general methodology for the efficient parallelization of existing data cube construction algorithms. We describe two different partitioning strategies, one for top-down and one for bottom-up cube algorithms. Both partitioning strategies assign subcubes to individual processors in such a way that the loads assigned to the processors are balanced. Our methods reduce interprocessor communication overhead by partitioning the load in advance instead of computing each individual group-by in parallel. Our partitioning strategies create a small number of coarse tasks. This allows for sharing of prefixes and sort orders between different group-by computations. Our methods enable code reuse by permitting the use of existing sequential (external memory) data cube algorithms for the subcube computations on each processor. This supports the transfer of optimized sequential data cube code to a parallel setting. The bottom-up partitioning strategy balances the number of single...
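The idea of assigning a small number of coarse tasks to processors so that loads are balanced can be illustrated with a minimal sketch. It is not the paper's partitioning strategy, which the abstract does not spell out. It swaps in a generic greedy "largest task first" heuristic and assumes a hypothetical cost model in which each coarse task is the set of group-bys sharing the same leading dimension and its cost is simply the number of group-bys it contains. The names `group_bys`, `subtree_costs`, and `assign_tasks` are illustrative only.

```python
import heapq
from itertools import combinations


def group_bys(dims):
    """Enumerate all 2^d group-bys of a d-dimensional cube, including the empty 'ALL' one."""
    return [combo for k in range(len(dims) + 1) for combo in combinations(dims, k)]


def subtree_costs(dims):
    """Hypothetical cost model: one coarse task per leading dimension,
    costed by how many group-bys start with that dimension."""
    costs = {}
    for gb in group_bys(dims):
        if gb:  # skip the empty 'ALL' group-by
            costs[gb[0]] = costs.get(gb[0], 0) + 1
    return costs


def assign_tasks(task_costs, num_procs):
    """Generic greedy heuristic (not the paper's method): take tasks in
    decreasing cost order and give each to the currently least-loaded processor."""
    heap = [(0, p) for p in range(num_procs)]  # entries are (current load, processor id)
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(task)
        heapq.heappush(heap, (load + cost, p))
    loads = {p: sum(task_costs[t] for t in ts) for p, ts in assignment.items()}
    return assignment, loads


if __name__ == "__main__":
    costs = subtree_costs(["A", "B", "C", "D"])
    assignment, loads = assign_tasks(costs, num_procs=2)
    print(costs)
    print(assignment)
    print(loads)
```

Because the whole assignment is decided before any group-by is computed, each processor could then run an unmodified sequential cube algorithm on its own tasks, which is the code-reuse property the abstract emphasizes. A realistic cost model would estimate output sizes rather than count group-bys.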

Frank K. H. A. Dehne, Todd Eavis, Susanne E. Hambrusch, Andrew Rau-Chaplin

Data Cube | Data Cube Construction | Database | ICDT 2001 | Partitioning Strategies

Post Info

Added	29 Jul 2010
Updated	29 Jul 2010
Type	Conference
Year	2001
Where	ICDT
Authors	Frank K. H. A. Dehne, Todd Eavis, Susanne E. Hambrusch, Andrew Rau-Chaplin
