A popular approach to guarantee fault tolerance in safety-critical applications is to run the application on two processors. A checkpoint is inserted at the completion of the prim...
— Fault tolerance in MPI becomes a main issue in the HPC community. Several approaches are envisioned from user or programmer controlled fault tolerance to fully automatic fault ...
Aurelien Bouteiller, Boris Collin, Thomas Hé...
As more data value speculation mechanisms are being proposed to speed-up processors, there is growing pressure on the critical processor structures that must buffer the state of t...
Software distributed shared memory (DSM) improves the programmability of message-passing machines and workclusters by providing a shared memory abstract (i.e., a coherent global a...
Data Stream Management (DSM) addresses the continuous processing of sensor data. DSM requires the combination of stream operators, which may run on different distributed devices, ...
Gert Brettlecker, Heiko Schuldt, Peter M. Fischer,...