The paper presents a new data partitioning algorithm for parallel computing on heterogeneous processors. Like traditional functional partitioning algorithms, the algorithm assumes ...
The performance skeleton of an application is a short-running program whose performance in any scenario reflects the performance of the application it represents. Specifically, th...
We propose a model for describing and predicting the parallel performance of a broad class of parallel numerical software on distributed memory architectures. The purpose of this ...
Giuseppe Romanazzi, Peter K. Jimack, Christopher E...
GPUs have recently evolved into very fast parallel co-processors capable of executing general-purpose computations extremely efficiently. At the same time, multi-core CPUs evolutio...
George Teodoro, Rafael Sachetto Oliveira, Olcay Se...
Accurate performance predictions are difficult to achieve for parallel applications executing on production distributed systems. Conventional point-valued performance parameters a...