Lightweight checkpointing for concurrent ML

Transient faults that arise in large-scale software systems can often be repaired by re-executing the code in which they occur. Ascribing a meaningful semantics to safe re-execution in multithreaded code is not obvious, however. For a thread to correctly re-execute a region of code, it must ensure that all other threads that have witnessed its unwanted effects within that region are also reverted to a meaningful earlier state. If this is not done properly, data inconsistencies and other undesirable behavior may result. However, automatically determining what constitutes a consistent global checkpoint is not straightforward, since thread interactions are a dynamic property of the program. In this paper, we present a safe and efficient checkpointing mechanism for Concurrent ML (CML) that can be used to recover from transient faults. We introduce a new linguistic abstraction called stabilizers that permits the specification of per-thread monitors and the restoration of globally consistent checkpoints. ...
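
The requirement that every thread that witnessed a region's effects be reverted along with it can be illustrated with a small toy model. The sketch below is not the paper's stabilizer implementation and does not use CML; the class `CheckpointGraph` and its methods are hypothetical names introduced only for illustration. It assumes each thread takes a single checkpoint when it is created, and it conservatively treats every communication as tying both parties together.

```python
from collections import defaultdict


class CheckpointGraph:
    """Toy model (hypothetical, not the paper's API) of dependency-tracked rollback."""

    def __init__(self):
        self.state = {}                  # thread name -> current state
        self.saved = {}                  # thread name -> state at its checkpoint
        self.peers = defaultdict(set)    # thread name -> threads it communicated with

    def spawn(self, thread, initial):
        # Assumption: one checkpoint per thread, taken at creation.
        self.state[thread] = initial
        self.saved[thread] = initial

    def local_update(self, thread, delta):
        # An effect that no other thread has witnessed yet.
        self.state[thread] += delta

    def communicate(self, sender, receiver, value):
        # The receiver witnesses the sender's effect; record the dependency
        # in both directions so neither side can be reverted alone.
        self.state[receiver] += value
        self.peers[sender].add(receiver)
        self.peers[receiver].add(sender)

    def rollback(self, thread):
        # Collect every thread transitively connected to `thread` ...
        to_revert, stack = set(), [thread]
        while stack:
            t = stack.pop()
            if t in to_revert:
                continue
            to_revert.add(t)
            stack.extend(self.peers[t])
        # ... and restore all of them together to their checkpoints.
        for t in to_revert:
            self.state[t] = self.saved[t]
            self.peers[t].clear()
        return to_revert


graph = CheckpointGraph()
for name, initial in [("a", 0), ("b", 10), ("c", 100), ("d", 1000)]:
    graph.spawn(name, initial)

graph.local_update("a", 1)
graph.communicate("a", "b", 5)   # b witnesses a's effect
graph.communicate("b", "c", 7)   # c witnesses b, and so indirectly a
graph.local_update("d", 1)       # d never interacts with the others

reverted = graph.rollback("a")
```

In this run, rolling back thread `a` also reverts `b` and `c`, because both depend on it through communication, while `d` keeps its local update. The set of threads to revert is only known at run time, which mirrors the abstract's point that thread interactions are a dynamic property of the program.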

Lukasz Ziarek, Suresh Jagannathan

Consistent Global Checkpoint | JFP 2010 | Meaningful Earlier State | Safe

Post Info

Added	28 Jan 2011
Updated	28 Jan 2011
Type	Journal
Year	2010
Where	JFP
Authors	Lukasz Ziarek, Suresh Jagannathan
