Designing a distributed fault tolerance algorithm requires careful analysis of both fault models and diagnosis strategies. A system will fail if there are too many active faults, ...
Virtualization provides the possibility of whole machine migration and thus enables a new form of fault tolerance that is completely transparent to applications and operating syst...
In this paper, we use random-selection protocols in the full-information model to solve classical problems in distributed computing. Our main results are the following: • An O(l...
Shafi Goldwasser, Elan Pavlov, Vinod Vaikuntanatha...
Fault-tolerant time-triggered communication relies on the synchronization of local clocks. The startup problem is the problem of reaching a sufficient degree of synchronization a...