Record and Replay (RR) is a software-based state replication solution designed to support recording and subsequent replay of the execution of unmodified applications running on mu...
Philippe Bergheaud, Dinesh Subhraveti, Marc Vertes
This paper presents a component model for building distributed applications with fault-tolerance requirements. The AFT-CCM model selects the configuration of replicated services d...
Fault-tolerant programs are typically not only difficult to implement but also incur extra costs in terms of performance or resource consumption. Failures are typically relatively ...
Ilwoo Chang, Matti A. Hiltunen, Richard D. Schlich...
We present Zyzzyva, a protocol that uses speculation to reduce the cost and simplify the design of Byzantine fault-tolerant state machine replication. In Zyzzyva, replicas respond...
Ramakrishna Kotla, Lorenzo Alvisi, Michael Dahlin,...
A major challenge facing grid applications is the appropriate handling of failures. In this paper we address the problem of making parallel Java applications based on Remote Method...