We present the design and development of a Performance Data Representation System (PDRS) for scalable parallel computing. PDRS provides decision support that helps users find the r...
This paper discusses the design and the implementation of the LU factorization routines included in the Heterogeneous ScaLAPACK library, which is built on top of ScaLAPACK. These ...
Ravi Reddy Manumachu, Alexey L. Lastovetsky, Pedro...
Abstract— This paper introduces a novel architecture for performing the core computations required by dynamic programming (DP) techniques. The latter pertain to a vast range of a...
Abstract—Dynamic runtimes can simplify parallel programming by automatically managing concurrency and locality without further burdening the programmer. Nevertheless, implementin...
Richard M. Yoo, Anthony Romano, Christos Kozyrakis
This paper presents an architecture which is suitable for a massive parallelization of the compact genetic algorithm. The approach is scalable, has low synchronization costs, and i...