As the scale of high performance computing (HPC) continues to grow, application fault resilience becomes crucial. To address this problem, we are working on the design of an adapt...
In recent years, exciting technological advances have been made in the development of flexible electronics. These technologies offer the opportunity to weave computation, communicat...
Roozbeh Jafari, Foad Dabiri, Philip Brisk, Majid S...
A middleware architecture named ROAFTS (Real-time Object-oriented Adaptive Fault Tolerance Support) is presented. ROAFTS is designed to support adaptive fault-tolerant execution ...
As grids typically consist of autonomously managed subsystems with strongly varying resources, fault-tolerance forms an important aspect of the scheduling process of appl...
Maria Chtepen, Filip H. A. Claeys, Bart Dhoedt, Fi...
Considerable work has been done on providing fault tolerance capabilities for different software components on large-scale high-end computing systems. Thus far, however, these fa...
Rinku Gupta, Pete Beckman, Byung-Hoon Park, Ewing ...