Sciweavers

207 search results - page 18 / 42
» Fault Tolerance in the R-GMA Information and Monitoring Syst...
ICDCS
1999
IEEE
14 years 1 day ago
HiFi: A New Monitoring Architecture for Distributed Systems Management
With the increasing complexity of large-scale distributed (LSD) systems, an efficient monitoring mechanism has become an essential service for improving the performance and reliab...
Ehab S. Al-Shaer, Hussein M. Abdel-Wahab, Kurt Mal...
GLVLSI
2008
IEEE
14 years 2 months ago
NBTI resilient circuits using adaptive body biasing
Reliability has become a practical concern in today’s VLSI design with advanced technologies. In-situ sensors have been proposed for reliability monitoring to provide advance wa...
Zhenyu Qi, Mircea R. Stan
CORR
2010
Springer
13 years 7 months ago
Unidirectional Error Correcting Codes for Memory Systems: A Comparative Study
In order to achieve fault tolerance, highly reliable systems often require the ability to detect errors as soon as they occur and prevent the spread of erroneous information throu...
Muzhir Al-Ani, Qeethara Al-Shayea
DSN
2000
IEEE
14 years 4 days ago
Executable Assertions for Detecting Data Errors in Embedded Control Systems
In order to be able to tolerate the effects of faults, we must first detect the symptoms of faults, i.e. the errors. This paper evaluates the error detection properties of an erro...
Martin Hiller
NSDI
2010
13 years 9 months ago
MapReduce Online
MapReduce is a popular framework for data-intensive distributed computing of batch jobs. To simplify fault tolerance, many implementations of MapReduce materialize the entire outp...
Tyson Condie, Neil Conway, Peter Alvaro, Joseph M....