The probability that a failure will occur before the end of the computation increases as the number of processors used in a high-performance computing application increases. For l...
Constructing logical machines out of collections of physical machines is a well-known technique for improving the robustness and fault tolerance of distributed systems. We present...
Yair Amir, Brian A. Coan, Jonathan Kirsch, John La...
The web services architecture emerged as an answer to the search for interoperability among applications. In recent years, there has been a growing interest in deploying on the Internet...
Giuliana Teixeira Santos, Lau Cheuk Lung, Carlos M...
As computational clusters increase in size, their mean time to failure decreases. Typically, checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...
In this paper, we tackle the problem of scheduling a periodic real-time system on identical multiprocessor platforms; moreover, the tasks considered may fail with a given probabilit...