It has been shown that a small number of FPGAs can significantly accelerate certain computing tasks by up to two or three orders of magnitude. However, particularly intensive lar...
Arun Patel, Christopher A. Madill, Manuel Salda&nt...
The BlueGene/L supercomputer will consist of 65,536 dual-processor compute nodes interconnected by two high-speed networks: a three-dimensional torus network and a tree topology ne...
When using a shared-memory multiprocessor, the programmer must select the portable programming model that will deliver the best performance. Even if he restricts his c...
Message passing via MPI is widely used in single-program, multiple-data (SPMD) parallel programs. Existing data-flow frameworks do not model the semantics of message-passing SPMD ...
Michelle Mills Strout, Barbara Kreaseck, Paul D. H...
This paper presents a new approach towards parallel I/O for message-passing (MPI) applications on clusters built with commodity hardware and an SCI interconnect: instead of using t...