Due to technology trends such as decreasing feature sizes and lower voltage levels, fault tolerance is becoming increasingly important in computing systems. Shared memory in modern multiprocessor systems is supported by cache coherence mechanisms, and the correctness of these mechanisms is crucial for data integrity. This work proposes an error detection scheme for snooping-based cache coherence protocols. For the widely used MESI coherence protocol, the proposed method does not introduce any performance overhead. Only a limited amount of additional hardware is required. Existing systems can be easily extended to support the proposed technique. Almost all single faults that can affect data integrity in the system are covered, with the exception of a few very rare cases. In fault injection experiments, no undetected fault led to corrupted application output.
Demid Borodin, Ben H. H. Juurlink