With the rapid progress in science and technology, we find ubiquitous use of safety-critical systems in avionics, consumer electronics, and medical instruments. In such systems, u...
Early failure detection in motor pumps is an important issue in predictive maintenance. An efficient condition-monitoring scheme is capable of providing warning and predic...
Flavia Cristina Bernardini, Ana Cristina Bicharra ...
In this paper we show that it is possible to implement a perfect failure detector P (one that detects all faulty processes if and only if those processes failed) in a non-synchr...
We investigate the problem of detecting termination of a distributed computation in systems where processes can fail by crashing. Specifically, when the communication topology is ...
Neeraj Mittal, Felix C. Freiling, Subbarayan Venka...
The productivity of HPC systems is determined not only by their performance, but also by their reliability. The conventional method to limit the impact of failures is checkpointing...