Fault tolerance is a major concern to guarantee availability of critical services as well as application execution. Traditional approaches for fault tolerance include checkpoint/r...
: Intrusion tolerance is a recent approach to deal with intentional and malicious failures. It combines the research on fault tolerance with the research on security, and relies on...
In this paper, we describe a proactive recovery scheme based on service migration for long-running Byzantine fault tolerant systems. Proactive recovery is an essential method for ...
This paper describes an asynchronous state-machine replication system that tolerates Byzantine faults, which can be caused by malicious attacks or software errors. Our system is t...
Large-scale parallel computing is relying increasingly on clusters with thousands of processors. At such large counts of compute nodes, faults are becoming common place. Current t...
Arun Babu Nagarajan, Frank Mueller, Christian Enge...