As supercomputers are being built from an ever increasing number of processing elements, the effort required to achieve a substantial fraction of the system peak performance is con...
On SMP clusters, mixed mode collective MPI communications, which use shared memory communications within SMP nodes and point-to-point communications between SMP nodes, are more eļ...
Meng-Shiou Wu, Ricky A. Kendall, Kyle Wright, Zhao...
An asynchronous work-stealing implementation of dynamic load balance is implemented using Uniļ¬ed Parallel C (UPC) and evaluated using the Unbalanced Tree Search (UTS) benchmark ...
Several existing compiler transformations can help improve communication-computation overlap in MPI applications. However, traditional compilers treat calls to the MPI library as ...
Anthony Danalis, Lori L. Pollock, D. Martin Swany,...
Abstract - Three dimensional computed tomography is a computationally intensive procedure, requiring large amounts of R A M and processing power. Parallel methods for two dimension...
David A. Reimann, Vipin Chaudhary, Michael J. Flyn...