Abstract. Most parallel systems on which MPI is used are now hierarchical: some processors are much closer to others in terms of interconnect performance. One of the most common su...
Hao Zhu, David Goodell, William Gropp, Rajeev Thak...
The BlueGene/L supercomputer will consist of 65,536 dual-processor compute nodes interconnected by two high-speed networks: a three-dimensional torus network and a tree topology ne...
As new processor and memory architectures advance, clusters start to be built from larger SMP systems, which makes MPI intra-node communication a critical issue in high performanc...
Due to the complexity associated with developing parallel applications, scientists and engineers rely on highlevel software libraries such as PETSc, ScaLAPACK and PESSL to ease th...
Pavan Balaji, Darius Buntinas, Satish Balay, Barry...
— This paper focuses on the transfer of large data in SMP systems. Achieving good performance for intranode communication is critical for developing an efficient communication s...