Sciweavers

» On Coordinated Checkpointing in Distributed Systems
CLUSTER
2005
IEEE
14 years 1 month ago
Transparent Checkpoint-Restart of Distributed Applications on Commodity Clusters
We have created ZapC, a novel system for transparent coordinated checkpoint-restart of distributed network applications on commodity clusters. ZapC provides a thin virtualization ...
Oren Laadan, Dan B. Phung, Jason Nieh
DEXA
2000
Springer
13 years 12 months ago
Protocol for Taking Object-Based Checkpoints
Object-based checkpoints are consistent in the object-based system but may be inconsistent according to the traditional message-based definition. We present a protocol for taking ...
Katsuya Tanaka, Makoto Takizawa
IPPS
2006
IEEE
14 years 1 month ago
Evaluating cooperative checkpointing for supercomputing systems
Cooperative checkpointing, in which the system dynamically skips checkpoints requested by applications at runtime, can exploit system-level information to improve performance and ...
Adam J. Oliner, Ramendra K. Sahoo
ICPADS
2010
IEEE
13 years 5 months ago
Hybrid Checkpointing for MPI Jobs in HPC Environments
As the core count in high-performance computing systems keeps increasing, faults are becoming commonplace. Checkpointing addresses such faults but captures full process images ev...
Chao Wang, Frank Mueller, Christian Engelmann, Ste...
JSW
2008
13 years 7 months ago
Architecture Support for Behavior-based Adaptive Checkpointing
Checkpointing is a commonly used approach to provide system fault-tolerance. However, using a constant checkpointing frequency may compromise the system's overall performance ...
Nianen Chen, Shangping Ren