Many large-scale production applications often have very long execution times and require periodic data checkpoints in order to save the state of the computation for program rest...
Wei-keng Liao, Avery Ching, Kenin Coloma, Alok N. ...
Application development for distributed-computing "Grids" can benefit from tools that variously hide or enable application-level management of critical aspects of the ...
Nicholas T. Karonis, Brian R. Toonen, Ian T. Foste...
Abstract. An MPI library, called MPICH-PM/CLUMP, has been implemented on a cluster of SMPs. MPICH-PM/CLUMP realizes zero copy message passing between nodes while using one copy mes...
Toshiyuki Takahashi, Francis O'Carroll, Hiroshi Te...
Abstract. The increasing number of cores per node in high-performance computing requires an efficient intra-node MPI communication subsystem. Most existing MPI implementations rel...
As new processor and memory architectures advance, clusters start to be built from larger SMP systems, which makes MPI intra-node communication a critical issue in high performanc...