The use of several distinct recovery procedures is one of the techniques that can be used to ensure high availability and fault-tolerance of computer systems. This method has been...
Sergiy A. Vilkomir, David Lorge Parnas, Veena B. M...
Abstract--We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...
The paper presents a hierarchical modeling approach of the N version programming in a real – time environment. The model is constructed in three layers. At the first layer we d...
Abstract. In order to construct and deploy massively multiagent systems, we must address one of the fundamental issues of distributed systems, the possibility of partial failures. ...
A case study of performance and dependability evaluation of fault-tolerant multiprocessors is presented. Two specific architectures are analyzed taking into account system functio...