The research community has considered hash-based parallel join algorithms the algorithms of choice for almost a decade. However, almost none of the commercial parallel database systems use hashing-based join algorithms, using instead nested-loops with index or sort-merge. While the research literature abounds with comparisons between the various hash-based and sort-merge join algorithms, to our knowledge there is no published comparison between the parallel hash-based algorithms and a parallel nested loops algorithm with index. In this paper we present a comparison of four variants of parallel index nested loops algorithms with the parallel hybrid hash algorithm. The conclusions of our experiments both with an analytic model and with an implementation in the Gamma parallel database system are that (1) overall, parallel hybrid hash is the method of choice, but (2) there are cases where nested-loops with index wins big enough that systems could profit from implementing both algorithms. Furthermore,...
David J. DeWitt, Jeffrey F. Naughton, Joseph Burge