In this paper, we describe the design and implementation of two mechanisms for fault-tolerance and recovery for complex scientific workflows on computational grids. We present our ...
Initial versions of MPI were designed to work efficiently on multi-processors which had very little job control and thus static process models. Subsequently forcing them to suppor...
Fault-tolerant time-triggered communication relies on the synchronization of local clocks. The startup problem is the problem of reaching a sufficient degree of synchronization a...
Abstract. The exploitation of the physical characteristics has already been demonstrated in the intrinsic evolution of electronic circuits. This paper is an initial attempt at crea...
This paper presents a new fault-tolerance technique for cache memories. Current fault-tolerance techniques for caches are limited either by the number of faults that can be tolera...