Hard disk drive failures are rare but are often costly. The ability to predict failures is important to consumers, drive manufacturers, and computer system manufacturers alike. In...
Fault-tolerant programs are typically not only difficult to implement but also incur extra costs in terms of performance or resource consumption. Failures are typically relatively ...
Ilwoo Chang, Matti A. Hiltunen, Richard D. Schlich...
In this paper, a new paradigm for designing logic circuits with concurrent error detection (CED) is described. The key idea is to exploit the asymmetric soft error susceptibility ...
Abstract— Measurements have shown that network path failures occur frequently in the Internet and physical link failures can cause network instability in large scale and severity...
Three protocols for gossip-based failure detection services in large-scale heterogeneous clusters are analyzed and compared. The basic gossip protocol provides a means by which fai...