We introduce a new deterministic parallel sorting algorithm based on the regular sampling approach. The algorithm uses only two rounds of regular all-to-all personalized communica...
We present and evaluate GPU Bucket Sort, a parallel deterministic sample sort algorithm for many-core GPUs. Our method is considerably faster than Thrust Merge (Satish et.al., Pro...
In this paper various algorithms for sorting on processor networks are considered. We focus on meshes, but the results can be generalized easily to other decomposable architecture...
Many sorting algorithms have been studied in the past, but there are only a few algorithms that can effectively exploit both SIMD instructions and threadlevel parallelism. In this...
We present a new deterministic sorting algorithm that interleaves the partitioning of a sample sort with merging. Sequentially, it sorts n elements in O(n log n) time cache-oblivi...