Sciweavers

CASES
2008
ACM

A light-weight cache-based fault detection and checkpointing scheme for MPSoCs enabling relaxed execution synchronization

While technology advances have made MPSoCs a standard architecture for embedded systems, their applicability is increasingly being challenged by dramatic increases in the amount of device failures that may occur during execution. Conventional fault tolerance techniques employ a duplication-and-comparison strategy to detect arbitrary execution faults, as well as a checkpointing-and-rollback strategy to recover from the faulty state. Comparison and checkpointing are performed either at task level, thus imposing a large amount of overhead in verifying and backing up memory pages, or at instruction level, thus necessitating a lock-step execution model which significantly limits the attainable performance. To overcome the shortcomings of both strategies, in this paper we propose a cache-based fault tolerance scheme wherein the comparison and checkpointing process is performed at the cache-memory interface. By allowing two processors that execute duplicated tasks to share a single data cache...
Chengmo Yang, Alex Orailoglu
Added 12 Oct 2010
Updated 12 Oct 2010
Type Conference
Year 2008
Where CASES
Authors Chengmo Yang, Alex Orailoglu