Process scaling has given designers billions of transistors to work with. As feature sizes near the atomic scale, extensive variation and wearout inevitably make margining unecono...
David Fick, Andrew DeOrio, Jin Hu, Valeria Bertacc...
The term “Self-healing” denotes the capability of a software system in dealing with bugs. Fault tolerance for dependable computing is to provide the specified service through ...
Two schemes proposed to cope with unrecoverable or latent media errors and enhance the reliability of RAID systems are examined. The first scheme is the established, widely used d...
Ilias Iliadis, Robert Haas, Xiao-Yu Hu, Evangelos ...
This paper presents an approach to system-level optimization of error detection implementation in the context of fault-tolerant realtime distributed embedded systems used for safe...
Adrian Lifa, Petru Eles, Zebo Peng, Viacheslav Izo...
Distributed key-value stores are now a standard component of high-performance web services and cloud computing applications. While key-value stores offer significant performance...