Sciweavers

307 search results - page 3 / 62
» On the Integrity of Lightweight Checkpoints
ISCA
2002
IEEE
115 views, Hardware
14 years 9 days ago
SafetyNet: Improving the Availability of Shared Memory Multiprocessors with Global Checkpoint/Recovery
We develop an availability solution, called SafetyNet, that uses a unified, lightweight checkpoint/recovery mechanism to support multiple long-latency fault detection schemes. At...
Daniel J. Sorin, Milo M. K. Martin, Mark D. Hill, ...
ENTCS
2007
113 views
13 years 7 months ago
Modular Checkpointing for Atomicity
Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics for safe re-execut...
Lukasz Ziarek, Philip Schatz, Suresh Jagannathan
SIAMSC
2010
132 views
13 years 5 months ago
New Algorithms for Optimal Online Checkpointing
Frequently, the computation of derivatives for optimizing time-dependent problems is based on the integration of the adjoint differential equation. For this purpose, the knowledge...
Philipp Stumm, Andrea Walther
SRDS
2003
IEEE
14 years 19 days ago
Raptor: Integrating Checkpoints and Thread Migration for Cluster Management
Software distributed shared-memory (SDSM) provides the abstraction necessary to run shared-memory applications on cost-effective parallel platforms such as clusters of workstations. Howeve...
Hazim Shafi, Evan Speight, John K. Bennett