Faster Joins, Self Joins and Multi-Way Joins Using Join Indices

15 years 11 months ago


We propose a new algorithm, called Stripe-join, for performing a join given a join index. Stripe-join is inspired by an algorithm called "Jive-join" developed by Li and Ross. Stripe-join makes a single sequential pass through each input relation, in addition to one pass through the join index and two passes through a set of temporary files that contain tuple identifiers but no input tuples. Stripe-join performs this efficiently even when the input relations are much larger than main memory, as long as the number of blocks in main memory is of the order of the square root of the number of blocks in the participating relations. Stripe-join is particularly efficient for self-joins. To our knowledge, Stripe-join is the first algorithm that, given a join index and a relation significantly larger than main memory, can perform a self-join with just a single pass over the input relation and without storing input tuples in intermediate files. Almost all the I/O is sequential, thus minimizing the impac...
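To make the phrase "given a join index" concrete, here is a minimal in-memory sketch of what a join index is and how a self-join result can be read off it. This is not Stripe-join: it uses random access into a Python list and ignores I/O entirely, whereas the abstract's contribution is doing the same work with sequential passes over relations larger than memory. The names `build_join_index` and `join_with_index`, the sample rows, and the use of list positions as tuple identifiers are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Row = Tuple[int, str]          # (join key, payload); a hypothetical row layout
JoinIndex = List[Tuple[int, int]]  # pairs of tuple identifiers (r_id, s_id)


def build_join_index(r: List[Row], s: List[Row]) -> JoinIndex:
    """Precompute matching tuple-identifier pairs for an equijoin on the key.

    Tuple identifiers are simply list positions here (an assumption).
    """
    positions_by_key: Dict[int, List[int]] = defaultdict(list)
    for s_id, (key, _) in enumerate(s):
        positions_by_key[key].append(s_id)
    return [
        (r_id, s_id)
        for r_id, (key, _) in enumerate(r)
        for s_id in positions_by_key[key]
    ]


def join_with_index(r: List[Row], s: List[Row], index: JoinIndex) -> List[Tuple[Row, Row]]:
    """Materialize the join by following identifier pairs (naive random access)."""
    return [(r[r_id], s[s_id]) for r_id, s_id in index]


# Self-join: the same relation plays both roles.
employees: List[Row] = [(1, "ann"), (2, "bob"), (1, "cy")]
full_index = build_join_index(employees, employees)
# Keep only pairs of distinct tuples, each unordered pair once.
self_index = [(i, j) for i, j in full_index if i < j]
result = join_with_index(employees, employees, self_index)
```

In this sketch the join index holds only identifiers, never row contents, which mirrors the abstract's point that the temporary files contain tuple identifiers but no input tuples.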

Hui Lei, Kenneth A. Ross
