Previous implementations of out-of-core columnsort limit the problem size to $N \le (M/P)^{3/2}$, where $N$ is the number of records to sort, $P$ is the number of processors, and $M$ is the total number of records that the entire system can hold in its memory (so that $M/P$ is the number of records that a single processor can hold in its memory). We implemented two variations to out-of-core columnsort that relax this restriction. Subblock columnsort is based on an algorithmic modification of the underlying columnsort algorithm, and it improves the problem-size bound to $N \le (M/P)^{5/3} / 4^{2/3}$ but at the cost of additional disk I/O. M-columnsort changes the notion of the column size in columnsort, improving the maximum problem size to $N \le M^{3/2}$ but at the cost of additional computation and communication. Experimental results on a Beowulf cluster show that both subblock columnsort and M-columnsort run well but that M-columnsort is faster. A further advantage of M-columnsort is that it handles a wider ...
Geeta Chaudhry, Elizabeth A. Hamon, Thomas H. Corm
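To get a feel for how the three problem-size bounds quoted in the abstract compare, the following minimal sketch evaluates them for a given memory size and processor count. The function name `columnsort_bounds`, the dictionary keys, and the sample values of M and P are assumptions made for illustration only; they do not come from the paper or its experiments.

```python
def columnsort_bounds(M: int, P: int) -> dict[str, float]:
    """Evaluate the maximum problem size N (in records) for each variant.

    M: total number of records the whole system can hold in memory.
    P: number of processors, so M / P is the per-processor capacity.
    """
    per_proc = M / P
    return {
        # Earlier out-of-core columnsort: N <= (M/P)^(3/2)
        "original": per_proc ** (3 / 2),
        # Subblock columnsort: N <= (M/P)^(5/3) / 4^(2/3)
        "subblock": per_proc ** (5 / 3) / 4 ** (2 / 3),
        # M-columnsort: N <= M^(3/2)
        "m_columnsort": M ** (3 / 2),
    }


if __name__ == "__main__":
    # Hypothetical configuration: 16 processors, 2^24 records per processor.
    P = 16
    M = P * 2**24
    for name, bound in columnsort_bounds(M, P).items():
        print(f"{name:>13}: N <= {bound:.3e} records")
```

Comparing the first two formulas directly, the subblock bound exceeds the original bound only when M/P > 256, since (M/P)^(5/3) / 4^(2/3) > (M/P)^(3/2) is equivalent to (M/P)^(1/6) > 4^(2/3). The M-columnsort bound is larger than the original bound by a factor of P^(3/2).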