As our reliance on computers increases, so does the need for robust software. Previous studies have shown that many C libraries exhibit robustness problems due to exceptional inpu...
Gossip-based broadcast algorithms have been considered as a viable alternative to traditional deterministic reliable broadcast algorithms in large scale environments. However, the...
This paper shows that, in an environment where we do not bound the number of faulty processes, the class P of Perfect failure detectors is the weakest (among realistic failure det...
Software developers identify two main reasons why software systems are not made robust: performance and practicality. This work demonstrates the effectiveness of general technique...
Protocols which solve agreement problems are essential building blocks for fault tolerant distributed applications. While many protocols have been published, little has been done ...
Traditional problem determination techniques rely on static dependency models that are difficult to generate accurately in today’s large, distributed, and dynamic application e...
Mike Y. Chen, Emre Kiciman, Eugene Fratkin, Armand...
The NASA Remote Exploration and Experimentation (REE) Project, managed by the Jet Propulsion Laboratory, has the vision of bringing commercial supercomputing technology into space...
This paper reports work to support dependability arguments about the future reliability of a product before there is direct empirical evidence. We develop a method for estimating ...