Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high per...
An attractive paradigm for building fast numerical algorithms is the following: 1 try a fast but occasionally unstable algorithm, 2 test the accuracy of the computed answer, a...
To help programmers of high-performance computing (HPC) systems avoid communication-related errors, we employ a formal process algebra, Communicating Sequential Processes (CSP), w...
Sample sort, a generalization of quicksort that partitions the input into many pieces, is known as the best practical comparison based sorting algorithm for distributed memory para...
As the desire of scientists to perform ever larger computations drives the size of today’s high performance computers from hundreds, to thousands, and even tens of thousands of ...