: We present a new approach to fault tolerance for High Performance Computing system. Our approach is based on a careful adaptation of the Algorithmic Based Fault Tolerance techniq...
George Bosilca, Remi Delmas, Jack Dongarra, Julien...
One of the topics of paramount importance in the development of Cluster and Grid middleware is the impact of faults since their occurrence in Grid infrastructures and in large-sca...
William Hoarau, Pierre Lemarinier, Thomas Hé...
In this paper we present the design and implementation of a Pluggable Fault Tolerant CORBA Infrastructure that provides fault tolerance for CORBA applications by utilizing the plu...
Wenbing Zhao, Louise E. Moser, P. M. Melliar-Smith
The following three questions should be answered in developing new topology with more powerful ability to tolerate node-failure in wireless sensor network. First, what is node-fai...