: A general purpose group communication protocol suite called Newtop is described. It is assumed that processes can simultaneously belong to many groups, group size could be large,...
- Partitionability allows the creation of many physically independent subsystems, each of which retains an identical functionality as its parent network and has no communication in...
As computational clusters increase in size, their mean-time-to-failure reduces. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
Record and Replay (RR) is a software based state replication solution designed to support recording and subsequent replay of the execution of unmodified applications running on mu...
Philippe Bergheaud, Dinesh Subhraveti, Marc Vertes
A fault ring is a connection of only nonfaulty adjacent nodes and links such that the interior of the ring contains only faulty components. This paper proposes two wormhole routin...