In this paper, we describe the design and implementation of two mechanisms for fault-tolerance and recovery for complex scientific workflows on computational grids. We present our ...
This paper analyzes the performability of client-server applications that use a separate fault management architecture for monitoring and controlling of the status of the applicat...
—Fault injection (FI) has been shown to be an effective approach to assessing the dependability of software systems. To determine the impact of faults injected during FI, a given...
: The distributed recovery block (DRB) scheme is a widely applicable approach for realizing both hardware and software fault tolerance in real-time distributed and parallel compute...
Using decentralized control structures for robot control can offer a lot of advantages, such as less complexity, better fault tolerance and more flexibility. In this paper the ev...