Sciweavers

» Non-intrusive System Level Fault-Tolerance
CCGRID
2006
IEEE
Proposal of MPI Operation Level Checkpoint/Rollback and One Implementation
With the increasing number of processors in modern HPC (High Performance Computing) systems, there are two emergent problems to solve. One is scalability, the other is fault tolera...
Yuan Tang, Graham E. Fagg, Jack Dongarra
ICCAD
2003
IEEE
Dynamic Fault-Tolerance and Metrics for Battery Powered, Failure-Prone Systems
Emerging VLSI technologies and platforms are giving rise to systems with inherently high potential for runtime failure. Such failures range from intermittent electrical and mechan...
Phillip Stanley-Marbell, Diana Marculescu
ICCD
2002
IEEE
Using Offline and Online BIST to Improve System Dependability - The TTPC-C Example
Fault-tolerant distributed real-time systems are presently facing a lot of new challenges. Although many techniques provide effective masking of node failures on the architectural...
Andreas Steininger, Johann Vilanek
ICRE
1998
IEEE
Validating Requirements for Fault Tolerant Systems using Model Checking
Model checking is shown to be an effective tool in validating the behavior of a fault tolerant embedded spacecraft controller. The case study presented here shows that by judiciously abst...
Francis Schneider, Steve M. Easterbrook, John R. C...
ICCD
2004
IEEE
Toward an Integrated Design Methodology for Fault-Tolerant, Multiple Clock/Voltage Integrated Systems
This paper describes a communication-centric design methodology that addresses the fundamental challenges induced by the emergence of truly heterogeneous Systems-on-Chip ...
Radu Marculescu, Diana Marculescu, Larry T. Pilegg...