Practical Skew Handling in Parallel Joins

We present an approach to dealing with skew in parallel joins in database systems. Our approach is easily implementable within current parallel DBMS, and performs well on skewed data without degrading the performance of the system on non-skewed data. The main idea is to use multiple algorithms, each specialized for a different degree of skew, and to use a small sample of the relations being joined to determine which algorithm is appropriate. We developed, implemented, and experimented with four new skew-handling parallel join algorithms; one, which we call virtual processor range partitioning, was the clear winner in high skew cases, while traditional hybrid hash join was the clear winner in lower skew or no skew cases. We present experimental results from an implementation of all four algorithms on the Gamma parallel database machine. To our knowledge, these are the first reported skew-handling numbers from an actual implementation.
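The sample-then-choose idea in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the skew measure (share of the sample taken by its most frequent join key), the threshold of twice a processor's fair share, and the function names `estimate_skew` and `choose_join_algorithm` are all assumptions made for illustration only.

```python
import random
from collections import Counter


def estimate_skew(join_keys, sample_size=1000, seed=0):
    """Return the fraction of a random sample held by its most common key.

    Assumption: this simple statistic stands in for whatever skew
    estimate a real system would derive from its sample.
    """
    rng = random.Random(seed)
    sample = rng.sample(join_keys, min(sample_size, len(join_keys)))
    if not sample:
        return 0.0
    most_common_count = Counter(sample).most_common(1)[0][1]
    return most_common_count / len(sample)


def choose_join_algorithm(join_keys, num_processors, sample_size=1000, seed=0):
    """Pick a join algorithm name from a small sample of the join keys.

    Hypothetical rule: if one key alone would exceed twice a processor's
    fair share of the tuples, treat the data as highly skewed.
    """
    skew = estimate_skew(join_keys, sample_size, seed)
    threshold = 2.0 / num_processors  # assumed cut-off, not from the paper
    if skew > threshold:
        return "virtual_processor_range_partitioning"
    return "hybrid_hash"


if __name__ == "__main__":
    uniform_keys = list(range(10_000))                   # every key distinct
    skewed_keys = [0] * 5_000 + list(range(1, 5_001))    # one key holds half
    print(choose_join_algorithm(uniform_keys, num_processors=8))
    print(choose_join_algorithm(skewed_keys, num_processors=8))
```

Because only a small sample is inspected, the choice costs little on non-skewed data, which matches the abstract's goal of not degrading the common case.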

David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri

Current Parallel DBMS | Database | Parallel Join | Skew Cases | VLDB 1992

Added	11 Aug 2010
Updated	11 Aug 2010
Type	Conference
Year	1992
Where	VLDB
Authors	David J. DeWitt, Jeffrey F. Naughton, Donovan A. Schneider, S. Seshadri
