A scalable hash ripple join algorithm


Recently, Haas and Hellerstein proposed the hash ripple join algorithm in the context of online aggregation. Although the algorithm rapidly gives a good estimate for many join-aggregate problem instances, the convergence can be slow if the number of tuples that satisfy the join predicate is small or if there are many groups in the output. Furthermore, if memory overflows (for example, because the user allows the algorithm to run to completion for an exact answer), the algorithm degenerates to block ripple join and performance suffers. In this paper, we build on the work of Haas and Hellerstein and propose a new algorithm that (a) combines parallelism with sampling to speed convergence, and (b) maintains good performance in the presence of memory overflow. Results from a prototype implementation in a parallel DBMS show that its rate of convergence scales with the number of processors, and that when allowed to run to completion, even in the presence of memory overflow, it is competitive...
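To make the idea behind the abstract concrete, the following is a minimal sketch of a basic in-memory hash ripple join that produces a running estimate of a SUM aggregate over an equijoin. It illustrates only the general technique attributed above to Haas and Hellerstein, not the parallel algorithm proposed in the paper, and it does not handle memory overflow. The function name `hash_ripple_join_sum`, its parameters, and the simple scale-up estimator are assumptions made for illustration.

```python
import random
from collections import defaultdict


def hash_ripple_join_sum(r_rows, s_rows, key_r, key_s, value, seed=0):
    """Yield (n_r, n_s, estimate) after each step of a square ripple join.

    Hypothetical helper: both inputs are shuffled so that every prefix is a
    random sample, and each new tuple probes the hash table built on the
    tuples already seen from the other input.
    """
    rng = random.Random(seed)
    r, s = list(r_rows), list(s_rows)
    rng.shuffle(r)
    rng.shuffle(s)

    seen_r = defaultdict(list)  # join key -> R tuples read so far
    seen_s = defaultdict(list)  # join key -> S tuples read so far
    n_r = n_s = 0
    partial = 0.0               # SUM over join results found so far

    for i in range(max(len(r), len(s))):
        if i < len(r):
            t = r[i]
            for u in seen_s.get(key_r(t), ()):
                partial += value(t, u)
            seen_r[key_r(t)].append(t)
            n_r += 1
        if i < len(s):
            u = s[i]
            for t in seen_r.get(key_s(u), ()):
                partial += value(t, u)
            seen_s[key_s(u)].append(u)
            n_s += 1
        # Scale the partial sum from the sampled n_r x n_s corner of the
        # cross product up to the full |R| x |S| cross product.
        if n_r and n_s:
            estimate = partial * (len(r) * len(s)) / (n_r * n_s)
        else:
            estimate = 0.0
        yield n_r, n_s, estimate
```

Each matching pair is counted exactly once, when the later of its two tuples arrives, so the estimate equals the exact answer once both inputs are exhausted. When few tuples satisfy the join predicate, early prefixes contain few matches and the estimate fluctuates widely, which is the slow convergence the abstract describes.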

Gang Luo, Curt J. Ellmann, Peter J. Haas, Jeffrey F. Naughton


Database | Hash Join Algorithm | Memory Overflow | Ripple Join Algorithm | SIGMOD 2002 |


» Handling Data Skew in Multiprocessor Database Computers Using Partition Tuning

» Magic Square Scalable Peer-to-Peer Lookup Protocol Considering Peers Characteristics

» SINA Scalable Incremental Processing of Continuous Queries in Spatiotemporal Databases

» Efficient Processing of RDF Queries with Nested Optional Graph Patterns in an RDBMS

» Efficient Enumeration of Frequent Sequences

» Distributed Evaluation of Continuous Equijoin Queries over Large Structured Overlay Networ...

» Storing and Locating Mutable Data in Structured Peer-to-Peer Overlay Networks

» Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters

Post Info

Added	08 Dec 2009
Updated	08 Dec 2009
Type	Conference
Year	2002
Where	SIGMOD
Authors	Gang Luo, Curt J. Ellmann, Peter J. Haas, Jeffrey F. Naughton
