We study the quality of service (QoS) of failure detectors. By QoS, we mean a specification that quantifies 1) how fast the failure detector detects actual failures and 2) how we...
This paper presents a generic methodology to transform a protocol resilient to process crashes into one resilient to arbitrary failures in the case where processes run the same te...
Distributed applications can fail in subtle ways that depend on the state of multiple parts of a system. This complicates the validation of such systems via fault injection, since...
Ramesh Chandra, Ryan M. Lefever, Michel Cukier, Wi...
This paper tests the hypothesis that generic recovery techniques, such as process pairs, can survive most application faults without using application-specific information. We ex...
Recent work has shown the advantages of using persistent memory for transaction processing. In particular, the Vista transaction system uses recoverable memory to avoid disk I/O, ...
Group communication systems are proven tools upon which to build fault-tolerant systems. As the demands for fault-tolerance increase and more applications require reliable distrib...
Yair Amir, Claudiu Danilov, Jonathan Robert Stanto...
Byzantine quorum systems [13] enhance the availability and efficiency of fault-tolerant replicated services when servers may suffer Byzantine failures. An important limitation of...
Lorenzo Alvisi, Evelyn Tumlin Pierce, Dahlia Malkh...
We describe a methodology for testing a software system for possible security flaws. Based on the observation that most security flaws are caused by the program’s inappropria...