Adapting to intermittent faults in multicore systems


Future multicore processors will be more susceptible to a variety of hardware failures. In particular, intermittent faults, caused in part by manufacturing, thermal, and voltage variations, can cause bursts of frequent faults that last from several cycles to several seconds or more. Due to practical limitations of circuit techniques, cost-effective reliability will likely require the ability to temporarily suspend execution on a core during periods of intermittent faults. We investigate three of the most obvious techniques for adapting to the dynamically changing resource availability caused by intermittent faults, and demonstrate their different system-level implications. We show that system software reconfiguration has very high overhead, that temporarily pausing execution on a faulty core can lead to cascading livelock, and that using spare cores has high fault-free cost. To remedy these and other drawbacks of the three baseline techniques, we propose using a thin hardware/firmware l...

Philip M. Wells, Koushik Chakraborty, Gurindar S. Sohi


ASPLOS 2008 | Frequent Faults | Intermittent Faults | Programming Languages


Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2008
Where	ASPLOS
Authors	Philip M. Wells, Koushik Chakraborty, Gurindar S. Sohi
