In this paper, we describe the design and implementation of two mechanisms for fault-tolerance and recovery for complex scientific workflows on computational grids. We present our ...
To date, there is little evidence that modular reasoning about fault-tolerant systems can simplify the verification process in practice. We study this question using a prominent e...
—As semiconductor technology scales into the deep submicron regime the occurrence of transient or soft errors will increase. This will require new approaches to error detection. ...
Today organizations and business enterprises of all sizes need to deal with unprecedented amounts of digital information, creating challenging demands for mass storage and on-dema...
Sangeetha Seshadri, Ling Liu, Brian F. Cooper, Law...
This paper presents an approach for integrating fault-tolerance techniques into microprocessors by utilizing instruction redundancy as well as time redundancy. Smaller and smaller...