History repeats itself. Since the invention of the programmable computer, numerous computer scientists keep dedicating their professional lives to the design of “the single, bes...
This paper examines two Network Interface Card microarchitectures that support low latency, high bandwidth userlevel message passing in multi-user environments. The two are at dif...
Boon Seong Ang, Derek Chiou, Larry Rudolph, Arvind
Over the last decade, Message Passing Interface (MPI) has become a very successful parallel programming environment for distributed memory architectures such as clusters. However, ...
We have taken a NIST molecular dynamics simulation program (md3), which was configured as a single sequential process running on a CRAY C90 vector supercomputer, and parallelized ...
We describe and evaluate a new, pipelined algorithm for large, irregular all-gather problems. In the irregular all-gather problem each process in a set of processes contributes in...