Distributed storage systems often use data replication to mask failures and guarantee high data availability. Node failures can be transient or permanent. While the system must ge...
Jing Tian, Zhi Yang, Wei Chen, Ben Y. Zhao, Yafei ...
We consider bit communication complexity of binary consensus in synchronous message passing systems with processes prone to crashes. A distributed algorithm is locally scalable wh...
Abstract. Control systems for electrical microgrids rely ever more on heterogeneous off-the-shelf technology for hardware, software and networking among the intelligent electronic ...
Faults in an IP network have various causes such as the failure of one or more routers at the IP layer, fiber-cuts, failure of physical elements at the optical layer, or extraneo...
Srikanth Kandula, Dina Katabi, Jean-Philippe Vasse...
In this paper, we study and model a crash-recovery target and its failure detector’s probabilistic behavior. We extend Quality of Service (QoS) metrics to measure the recovery d...