Most application level fault tolerance schemes in literature are non-adaptive in the sense that the fault tolerance schemes incorporated in applications are usually designed witho...
Zizhong Chen, Ming Yang, Guillermo A. Francia III,...
Abstract. In order to support the dependability analysis of a system under design in an early phase of the design process, so-called fault tolerance libraries can be created that c...
Data aggregation is a fundamental building block of modern distributed systems. Averaging based approaches, commonly designated gossip-based, are an important class of aggregation ...
This paper proposes a new low power cache architecture that utilizes fault tolerance to allow aggressively reduced voltage levels. The fault tolerant overhead circuits consume lit...
Mohammad A. Makhzan, Amin Khajeh Djahromi, Ahmed M...
We describe the software architecture, technical features, and performance of TICK (Transparent Incremental Checkpointer at Kernel level), a system-level checkpointer implemented ...