Sciweavers

4 search results - page 1 / 1
» Evaluating cooperative checkpointing for supercomputing syst...
Sort
View
IPPS
2006
IEEE
14 years 1 months ago
Evaluating cooperative checkpointing for supercomputing systems
Cooperative checkpointing, in which the system dynamically skips checkpoints requested by applications at runtime, can exploit system-level information to improve performance and ...
Adam J. Oliner, Ramendra K. Sahoo
DSN
2005
IEEE
14 years 1 months ago
Probabilistic QoS Guarantees for Supercomputing Systems
Supercomputing systems must be able to reliably and efficiently complete their assigned workloads, even in the presence of failures. This paper proposes a system that allows the ...
Adam J. Oliner, Larry Rudolph, Ramendra K. Sahoo, ...
ICDCS
2011
IEEE
12 years 7 months ago
Provisioning a Multi-tiered Data Staging Area for Extreme-Scale Machines
—Massively parallel scientific applications, running on extreme-scale supercomputers, produce hundreds of terabytes of data per run, driving the need for storage solutions to im...
Ramya Prabhakar, Sudharshan S. Vazhkudai, Youngjae...
PVM
2005
Springer
14 years 1 months ago
Cooperative Write-Behind Data Buffering for MPI I/O
Many large-scale production parallel programs often run for a very long time and require data checkpoint periodically to save the state of the computation for program restart and/o...
Wei-keng Liao, Kenin Coloma, Alok N. Choudhary, Le...