Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
Parallel processors such as SIMD computers have been successfully used in various areas of high performance image and data processing. Due to their characteristics of highly regula...
Hybrid CMOS/non-CMOS memories, in short hybrid memories, have been lauded as future ultra-capacity data memories. Nonetheless, such memories are going to suffer from high degree o...
Distributed information systems are critical to the functioning of many businesses; designing them to be dependable is a challenging but important task. We report our experience i...
Jeremy Bryans, John S. Fitzgerald, Alexander Roman...