The increasing transient fault rate will necessitate onchip fault tolerance techniques in future processors. The speed gap between the processor and the memory is also increasing, causing the processor to stay idle for hundreds of cycles while waiting for a long-latency cache miss to be serviced. Even in the presence of aggressive prefetching techniques, future processors are expected to waste significant processing bandwidth waiting for main memory. This paper proposes Microarchitecture-Based Introspection (MBI), a transient-fault detection technique, which utilizes the wasted processing bandwidth during long-latency cache misses for redundant execution of the instruction stream. MBI has modest hardware cost, requires minimal modifications to the existing microarchitecture, and is particularly well suited for memory-intensive applications. Our evaluation reveals that the time redundancy of MBI results in an average IPC reduction of only 7.1% for memoryintensive benchmarks in the SP...
Moinuddin K. Qureshi, Onur Mutlu, Yale N. Patt