Component failure in large-scale IT installations is becoming an ever larger problem as the number of components in a single cluster approaches a million. In this paper, we presen...
Changing source code in large software systems is complex and requires a good understanding of dependencies between software components. Modification to components with little ...
Thomas Zimmermann, Nachiappan Nagappan, Kim Herzig...
Commodity file systems trust disks to either work or fail completely, yet modern disks exhibit more complex failure modes. We suggest a new fail-partial failure model for disks, ...
Vijayan Prabhakaran, Lakshmi N. Bairavasundaram, N...
We study communication complexity of consensus in synchronous message-passing systems with processes prone to crashes. The goal in the consensus problem is to have all the nonfaul...
Bogdan S. Chlebus, Dariusz R. Kowalski, Michal Str...
System-on-Chip designs often have a large number of timing domains. Communication between these domains requires synchronization, and the failure probabilities of these synchro...