Sciweavers

282 search results - page 31 / 57
» Transient Fault Detectors
Sort
View
ISCA
2002
IEEE
115views Hardware» more  ISCA 2002»
14 years 14 days ago
SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery
We develop an availability solution, called SafetyNet, that uses a unified, lightweight checkpoint/recovery mechanism to support multiple long-latency fault detection schemes. At...
Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, ...
DCOSS
2006
Springer
13 years 11 months ago
When Birds Die: Making Population Protocols Fault-Tolerant
In the population protocol model introduced by Angluin et al. [2], a collection of agents, which are modelled by finite state machines, move around unpredictably and have pairwise ...
Carole Delporte-Gallet, Hugues Fauconnier, Rachid ...
TPDS
2008
134views more  TPDS 2008»
13 years 7 months ago
Extending the TokenCMP Cache Coherence Protocol for Low Overhead Fault Tolerance in CMP Architectures
It is widely accepted that transient failures will appear more frequently in chips designed in the near future due to several factors such as the increased integration scale. On th...
Ricardo Fernández Pascual, José M. G...
DAC
2005
ACM
14 years 8 months ago
Fault and energy-aware communication mapping with guaranteed latency for applications implemented on NoC
As feature sizes shrink, transient failures of on-chip network links become a critical problem. At the same time, many applications require guarantees on both message arrival prob...
Sorin Manolache, Petru Eles, Zebo Peng
ASPLOS
2006
ACM
14 years 1 months ago
Understanding prediction-based partial redundant threading for low-overhead, high- coverage fault tolerance
Redundant threading architectures duplicate all instructions to detect and possibly recover from transient faults. Several lighter weight Partial Redundant Threading (PRT) archite...
Vimal K. Reddy, Eric Rotenberg, Sailashri Parthasa...