In this paper, we present a new fault tolerance system called DejaVu for transparent and automatic checkpointing, migration, and recovery of parallel and distributed applications....
Joseph F. Ruscio, Michael A. Heffner, Srinidhi Var...
We initiate a complexity-theoretic treatment of hardness amplification for collision-resistant hash functions, namely the transformation of weakly collision-resistant hash functio...
Ran Canetti, Ronald L. Rivest, Madhu Sudan, Luca T...
As modern supercomputing systems reach the peta-flop performance range, they grow in both size and complexity. This makes them increasingly vulnerable to failures from a variety o...
Greg Bronevetsky, Daniel Marques, Keshav Pingali, ...
Abstract. Sensor relocation protocols can be employed as fault tolerance approach to offset the coverage loss caused by node failures. We introduce a novel localized structure, in...
— The potential of truly large scale grids can only be realized with grid architectures and deployment strategies that lower the need for human administrative intervention, and t...