This paper shows how a state-of-the-art software distributed shared-memory (DSM) protocol can be efficiently extended to tolerate single-node failures. In particular, we extend a ...
Fault-tolerant (FT) distributed protocols (such as group membership and consensus) represent fundamental building blocks for many practical systems, e.g., the Google File System...
NAP, a detection- and recovery-based scheme for implementing fault-tolerant itinerant computations, is presented. We give the semantics for the scheme and describe a protocol that ...
Dag Johansen, Keith Marzullo, Fred B. Schneider, K...
A model was introduced in [Fraga97] for integrating replication techniques in heterogeneous systems. The model adopts a reflective structure based on the meta-object approach [10]...
Lau Cheuk Lung, Joni da Silva Fraga, Carlos Mazier...
Distributed information systems are critical to the functioning of many businesses; designing them to be dependable is a challenging but important task. We report our experience i...
Jeremy Bryans, John S. Fitzgerald, Alexander Roman...