Many state-of-the-art approaches to fault-tolerant system design make the simplifying assumption that all faults are detected within a certain time interval. However, based on a d...
Jia Huang, Kai Huang, Andreas Raabe, Christian Buc...
This paper presents the results from running five experiments with the Chime Parallel Processing System. The Chime System is an implementation of the CC++ programming language (p...
Anjaneya R. Chagam, Partha Dasgupta, Rajkumar Khan...
Traditional agreement-based Byzantine fault-tolerant (BFT) systems process all requests on all replicas to ensure consistency. In addition to the overhead for BFT protocol and sta...
ST-TCP (Server fault-Tolerant TCP) is an extension of TCP to tolerate TCP server failures. Server fault tolerance is provided by using an active-backup server that keeps track of ...
Replication is a key strategy for improving locality, fault tolerance and availability in distributed systems. The paper focuses on distributed file systems and presents a system ...