Sciweavers

307 search results - page 1 / 62
» On the Integrity of Lightweight Checkpoints
HASE
2008
IEEE
14 years 1 month ago
On the Integrity of Lightweight Checkpoints
This paper proposes a lightweight checkpointing scheme for real-time embedded systems. The goal is to separate concerns by allowing applications to take checkpoints independently ...
Raul Barbosa, Johan Karlsson
SSS
2010
Springer
13 years 5 months ago
Lightweight Live Migration for High Availability Cluster Service
High availability is a critical feature for service clusters and cloud computing, and is often considered more valuable than performance. One commonly used technique to enhance the...
Bo Jiang, Binoy Ravindran, Changsoo Kim
ICPADS
2010
IEEE
13 years 5 months ago
Hybrid Checkpointing for MPI Jobs in HPC Environments
As the core count in high-performance computing systems keeps increasing, faults are becoming commonplace. Checkpointing addresses such faults but captures full process images ev...
Chao Wang, Frank Mueller, Christian Engelmann, Ste...
JFP
2010
13 years 5 months ago
Lightweight checkpointing for concurrent ML
Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics for safe re-execut...
Lukasz Ziarek, Suresh Jagannathan
SC
2000
ACM
13 years 11 months ago
Scalable Fault-Tolerant Distributed Shared Memory
This paper shows how a state-of-the-art software distributed shared-memory (DSM) protocol can be efficiently extended to tolerate single-node failures. In particular, we extend a ...
Florin Sultan, Thu D. Nguyen, Liviu Iftode