CMOS technology trends are leading to an increasing incidence of hard (permanent) faults in processors. These faults may be introduced at fabrication or occur in the field. Wherea...
The use of good random numbers is essential to the integrity of many mission-critical systems. However, when such systems are replicated for Byzantine fault tolerance, a serious i...
The emergence of computational grids has lead to an increased reliance on task schedulers that can guarantee the completion of tasks that are executed on unreliable systems. There...
Daniel C. Vanderster, Nikitas J. Dimopoulos, Randa...
As the scale of high performance computing (HPC) continues to grow, application fault resilience becomes crucial. To address this problem, we are working on the design of an adapt...
A commonly used model for fault-tolerant computation is that of cellular automata. The essential difficulty of fault-tolerant computation is present in the special case of simply ...