This paper describes a single-version algorithmic approach to fault-tolerant design for various computing systems, using static redundancy to mask transient...
SOCK, a process calculus for modeling service-oriented systems, was recently extended with primitives for dynamic fault and compensation handling. In this paper we investigate...
Program slicing is a general, widely-used, and accepted technique applicable to different software engineering tasks including debugging, whereas model-based diagnosis is an AI te...
As cluster computing scales up, it is becoming hard for long-running applications to complete without encountering failures. To address this issue, chec...
In this paper, we propose a timing-reasoning algorithm to improve the resolution of delay fault diagnosis. In contrast to previous approaches which identify candidates by utilizin...