Abstract— Remote Direct Memory Access (RDMA) and pointto-point network fabrics both have their own advantages. MPI middleware implementations typically use one or the other, howe...
Networks of workstations are fast becoming the standard environment for parallel applications. However, the use of “found” resources as a platform for tightly-coupled runtime ...
Concurrent programming errors arise when threads share data incorrectly. Programmers often avoid these errors by using synchronization to enforce a simple ownership policy: data i...
Jean-Phillipe Martin, Michael Hicks, Manuel Costa,...
Stride-based prefetching mechanisms exploit regular streams of memory accesses to hide memory latency. While these mechanisms are effective, they can be improved by studying the p...
The growth in complexity of modern systems makes it increasingly difficult to extract high-performance. The software stacks for such systems typically consist of multiple layers a...