To be able to fully exploit ever larger computing platforms, modern HPC applications and system software must be able to tolerate inevitable faults. Historically, MPI implementati...
Joshua Hursey, Jeffrey M. Squyres, Timothy Mattox,...
In a distributed system, process synchronization is an important agenda. One of the major duties for process synchronization is mutual exclusion. This paper presents a new central...
Moharram Challenger, Vahid Khalilpour, Peyman Baya...
ing Abstraction to Improve Fault Tolerance MIGUEL CASTRO Microsoft Research and RODRIGO RODRIGUES and BARBARA LISKOV MIT Laboratory for Computer Science Software errors are a major...
The general approach to fault tolerance in uniprocessor systems is to maintain enough time redundancy in the schedule so that any task instance can be re-executed in presence of f...
We present a new approach that is able to produce an increased fault tolerance in bio-inspired electronic circuits. To this end, we designed hardwarefriendly genetic regulatory net...