Sciweavers

33 search results - page 5 / 7
CCGRID
2006
IEEE
14 years 2 months ago
Closing Cluster Attack Windows Through Server Redundancy and Rotations
It is well-understood that increasing redundancy in a system generally improves the availability and dependability of the system. In server clusters, one important form of redu...
Yih Huang, David Arsenault, Arun Sood
DSN
2009
IEEE
14 years 3 months ago
Low overhead Soft Error Mitigation techniques for high-performance and aggressive systems
The threat of soft error induced system failure in high performance computing systems has become more prominent, as we adopt ultra-deep submicron process technologies. In this pap...
Naga Durga Prasad Avirneni, Viswanathan Subramania...
SIGMETRICS
2008
ACM
13 years 8 months ago
Disk scrubbing versus intra-disk redundancy for high-reliability raid storage systems
Two schemes proposed to cope with unrecoverable or latent media errors and enhance the reliability of RAID systems are examined. The first scheme is the established, widely used d...
Ilias Iliadis, Robert Haas, Xiao-Yu Hu, Evangelos ...
IESS
2007
Springer
14 years 2 months ago
Error Containment in the Time-Triggered System-On-a-Chip Architecture
The time-triggered System-on-a-Chip (SoC) architecture provides a generic multicore system platform for a family of composable and dependable giga-scale SoCs. It supports...
Roman Obermaisser, Hermann Kopetz, Christian El Sa...
CDES
2006
13 years 10 months ago
Hybrid Error-Detection Approach with No Detection Latency for High-Performance Microprocessors
Error detection plays an important role in fault-tolerant computer systems. Two primary parameters concerned for error detection are the latency and coverage. In this paper, a ne...
Yung-Yuan Chen, Kuen-Long Leu, Li-Wen Lin