Fault tolerance in MPI is becoming a major issue in the HPC community. Several approaches are envisioned, from user- or programmer-controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...
The Distributed Shared Memory (DSM) model is designed to leverage the ease of programming of the shared memory paradigm, while enabling high performance by expressing locality ...
The InfiniBand™ Architecture (IBA) is a promising new I/O communication standard positioned for building clusters and System Area Networks (SANs). However, the IBA specification ...
With the latest high-end computing nodes combining shared-memory multiprocessing with hardware multithreading, new scheduling policies are necessary for workloads consisting of mu...
Robert L. McGregor, Christos D. Antonopoulos, Dimi...
The need to provide performance guarantees in high-performance servers has long been neglected. Providing performance guarantees in current and future servers is difficult because ...