Database systems support recovery, providing high database availability. However, database applications may lose work because of a server failure. In particular, if a database serv...
Roger S. Barga, David B. Lomet, Thomas Baby, Sanja...
Although conventional database management systems are designed to tolerate hardware and to a lesser extent even software errors, they cannot protect themselves against syntactical...
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
The probability of failures in software distributed shared memory (SDSM) increases as the system size grows. This paper introduces a new, efficient message logging technique, call...
Software distributed shared memory (DSM) improves the programmability of message-passing machines and workclusters by providing a shared memory abstract (i.e., a coherent global a...