The general approach to fault tolerance in uniprocessor systems is to maintain enough time redundancy in the schedule so that any task instance can be re-executed in the presence of f...
Commodity computer clusters are often composed of hundreds of computing nodes. These generally off-the-shelf systems are not designed for high reliability. Node failures therefore...
In this paper we tackle the problem of scheduling a periodic real-time system on identical multiprocessor platforms; moreover, the tasks considered may fail with a given probabilit...
Abstract. As grids typically consist of autonomously managed subsystems with strongly varying resources, fault-tolerance forms an important aspect of the scheduling process of appl...
Maria Chtepen, Filip H. A. Claeys, Bart Dhoedt, Fi...
In this paper, we present a distributed computing framework designed to support higher quality of service and fault tolerance for processing deadline-driven tasks in a P2P envi...