Widespread adaptation of shared memory programming for High Performance Computing has been inhibited by a lack of standardization and the resulting portability problems between pl...
Abstract. Wavefront computations are common in scientific applications. Although it is well understood how wavefronts are pipelined for parallel execution, the question remains: H...
This paper describes the design and the implementation of parallel routines in the Heterogeneous ScaLAPACK library that solve a dense system of linear equations. This library is w...
Ravi Reddy Manumachu, Alexey L. Lastovetsky, Pedro...
Parallel direct execution simulation is an important tool for performance and scalability analysis of large message passing parallel programs executing on top of a virtual compute...
Phillip M. Dickens, Matthew Haines, Piyush Mehrotr...
In this paper, we present techniques and algorithms to improve the performance of various communication patterns on message-passing platforms where, for reasons of safety, user-le...