Replication is a key strategy for improving locality, fault tolerance and availability in distributed systems. The paper focuses on distributed file systems and presents a system ...
We present a hybrid synthesis method for automatic addition of fault-tolerance to distributed programs. In particular, we automatically specify and add pre-synthesized fault-tolera...
Fault tolerance is a very important concern for critical high performance applications using the MPI library. Several protocols provide automatic and transparent fault detection a...
Pierre Lemarinier, Aurelien Bouteiller, Thomas H&e...
In this paper, we present the mechanisms needed for Byzantine fault tolerant coordination of Web services atomic transactions. The mechanisms have been incorporated into ...
Mobile ad hoc networks can be leveraged to provide ubiquitous services capable of acquiring, processing, and sharing real-time information from the physical world. Unlike Internet...