This paper presents a benchmark for dependable systems. The benchmark consists of two metrics, number of catastrophic incidents and performance degradation, which are obtained by a...
The probability that a failure will occur before the end of the computation increases as the number of processors used in a high performance computing application increases. For l...
An increasing number of applications are being developed using distributed object computing (DOC) middleware, such as CORBA. Many of these applications require the underlying midd...
Aniruddha S. Gokhale, Balachandran Natarajan, Doug...
In order to construct and deploy massively multiagent systems, we must address one of the fundamental issues of distributed systems, the possibility of partial failures. ...
As computational clusters increase in size, their mean time to failure decreases. Typically checkpointing is used to minimize the loss of computation. Most checkpointing techniques, ...