Sciweavers

2400 search results - page 54 / 480
» Systems Failures
SSS
2007
Springer
14 years 1 month ago
Global Predicate Detection in Distributed Systems with Small Faults
Abstract. We study the problem of global predicate detection in the presence of permanent and transient failures. We term the transient failures small faults. We show that it is imp...
Felix C. Freiling, Arshad Jhumka
IWNAS
2008
IEEE
14 years 2 months ago
Optimal Implementation of Continuous Data Protection (CDP) in Linux Kernel
To protect data and recover it in case of failures, the Linux operating system has a built-in MD device that implements RAID architectures. Such a device can recover data in case of sin...
Xu Li, Changsheng Xie, Qing Yang
ICML
2001
IEEE
14 years 8 months ago
Bayesian approaches to failure prediction for disk drives
Hard disk drive failures are rare but are often costly. The ability to predict failures is important to consumers, drive manufacturers, and computer system manufacturers alike. In...
Greg Hamerly, Charles Elkan
WDAG
1997
Springer
13 years 11 months ago
Heartbeat: A Timeout-Free Failure Detector for Quiescent Reliable Communication
Abstract. We study the problem of achieving reliable communication with quiescent algorithms (i.e., algorithms that eventually stop sending messages) in asynchronous systems with p...
Marcos Kawazoe Aguilera, Wei Chen, Sam Toueg
PODC
1994
ACM
13 years 11 months ago
A Checkpoint Protocol for an Entry Consistent Shared Memory System
Workstation clusters are becoming an interesting alternative to dedicated multiprocessors. In this environment, the probability of a failure, during an application's executio...
Nuno Neves, Miguel Castro, Paulo Guedes