Sciweavers

SIGMOD
2002
ACM

A scalable hash ripple join algorithm

Recently, Haas and Hellerstein proposed the hash ripple join algorithm in the context of online aggregation. Although the algorithm rapidly gives a good estimate for many join-aggregate problem instances, the convergence can be slow if the number of tuples that satisfy the join predicate is small or if there are many groups in the output. Furthermore, if memory overflows (for example, because the user allows the algorithm to run to completion for an exact answer), the algorithm degenerates to block ripple join and performance suffers. In this paper, we build on the work of Haas and Hellerstein and propose a new algorithm that (a) combines parallelism with sampling to speed convergence, and (b) maintains good performance in the presence of memory overflow. Results from a prototype implementation in a parallel DBMS show that its rate of convergence scales with the number of processors, and that when allowed to run to completion, even in the presence of memory overflow, it is competitive...
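For readers unfamiliar with the baseline the abstract builds on, the following is a minimal single-node, in-memory sketch of a hash ripple join that produces a running SUM estimate. It is an illustration only and is not the parallel, overflow-tolerant algorithm proposed in the paper. The function name `hash_ripple_join` and the parameters `key_r`, `key_s`, and `value` are hypothetical, and the sketch assumes both inputs fit in memory and can be shuffled up front to simulate random-order retrieval.

```python
import random
from collections import defaultdict


def hash_ripple_join(r_rows, s_rows, key_r, key_s, value, seed=0):
    """Yield (r_seen, s_seen, estimate) after each round.

    Illustrative sketch: alternately draws one tuple from each input,
    probes the other side's hash table, and scales the running sum by
    |R| * |S| / (r_seen * s_seen) to estimate SUM(value) over the join.
    """
    rng = random.Random(seed)
    r, s = list(r_rows), list(s_rows)
    if not r or not s:
        return  # an empty input has an empty join; nothing to estimate
    # Assumption: shuffling stands in for random-order retrieval.
    rng.shuffle(r)
    rng.shuffle(s)

    r_table = defaultdict(list)  # join key -> R tuples seen so far
    s_table = defaultdict(list)  # join key -> S tuples seen so far
    running_sum = 0
    r_seen = s_seen = 0

    for i in range(max(len(r), len(s))):
        if i < len(r):
            row = r[i]
            k = key_r(row)
            # Probe S tuples seen so far, then remember this R tuple.
            for match in s_table.get(k, ()):
                running_sum += value(row, match)
            r_table[k].append(row)
            r_seen += 1
        if i < len(s):
            row = s[i]
            k = key_s(row)
            # Probe R tuples seen so far (including the one just added).
            for match in r_table.get(k, ()):
                running_sum += value(match, row)
            s_table[k].append(row)
            s_seen += 1
        scale = (len(r) * len(s)) / (r_seen * s_seen)
        yield r_seen, s_seen, running_sum * scale


if __name__ == "__main__":
    R = [(i % 5, i) for i in range(20)]       # (join key, payload)
    S = [(i % 5, i * 10) for i in range(15)]
    for r_seen, s_seen, est in hash_ripple_join(
        R, S, lambda t: t[0], lambda t: t[0], lambda a, b: 1
    ):
        print(r_seen, s_seen, round(est, 1))
```

With `value` returning 1 the estimate is a running COUNT of join results. Once both inputs are fully consumed the scale factor is 1, so the final value is the exact answer. Because both hash tables grow with the number of tuples seen, this in-memory form is exactly where the memory-overflow concern raised in the abstract comes from.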
Gang Luo, Curt J. Ellmann, Peter J. Haas, Jeffrey F. Naughton
Added 08 Dec 2009
Updated 08 Dec 2009
Type Conference
Year 2002
Where SIGMOD
Authors Gang Luo, Curt J. Ellmann, Peter J. Haas, Jeffrey F. Naughton