Sciweavers

ICDCS
1997
IEEE
14 years 4 months ago
Distributed Recovery with K-Optimistic Logging
Fault-tolerance techniques based on checkpointing and message logging have been increasingly used in real-world applications to reduce service down-time. Most industrial applicati...
Yi-Min Wang, Om P. Damani, Vijay K. Garg
IPPS
2006
IEEE
14 years 6 months ago
Coordinated checkpoint from message payload in pessimistic sender-based message logging
Execution of MPI applications on Clusters and Grid deployments suffers from node and network failures, which motivates the use of fault-tolerant MPI implementations. Two category tec...
M. Aminian, Mohammad K. Akbari, Bahman Javadi