Fault-Tolerant Distributed Simulation

In traditional distributed simulation schemes, the entire simulation must be restarted if any participating logical process (LP) crashes. This is highly undesirable for long-running simulations, so some form of fault tolerance is required to minimize the wasted computation. In this paper, a rollback-based optimistic fault-tolerance scheme is integrated with an optimistic distributed simulation scheme. In rollback recovery schemes, checkpoints are periodically saved on stable storage. After a crash, these saved checkpoints are used to restart the computation. We make use of the novel insight that a failure can be modeled as a straggler event with a receive time equal to the virtual time of the last checkpoint saved on stable storage. This saves implementation effort and reduces overhead. We define stable global virtual time (SGVT) as the virtual time such that no state with a lower timestamp will ever be rolled back despite crash failures. A simple change is made in existing...
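
The failure-as-straggler idea can be illustrated with a toy logical process. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the class `LogicalProcess`, its methods, and the integer state are all hypothetical, and re-execution of undone events, anti-messages, and SGVT computation are left out. It is meant to show one thing: crash recovery can reuse the rollback routine that already handles stragglers, with the rollback target set to the virtual time of the last checkpoint on stable storage.

```python
from dataclasses import dataclass, field


@dataclass
class LogicalProcess:
    """Toy optimistic LP. All names are hypothetical, not taken from the paper."""
    state: int = 0
    lvt: int = 0  # local virtual time
    # Volatile (virtual time, state) snapshots, lost on a crash.
    snapshots: list = field(default_factory=lambda: [(0, 0)])
    # Last checkpoint written to stable storage, survives a crash.
    stable: tuple = (0, 0)

    def process(self, recv_time: int, delta: int) -> None:
        """Apply an event; an event older than LVT is a straggler."""
        if recv_time < self.lvt:
            self.rollback(recv_time)
        self.state += delta
        self.lvt = recv_time
        self.snapshots.append((self.lvt, self.state))

    def rollback(self, to_time: int) -> None:
        """Discard snapshots later than to_time and restore the latest survivor.

        A real simulator would also re-execute or cancel the undone events.
        """
        while self.snapshots[-1][0] > to_time:
            self.snapshots.pop()
        self.lvt, self.state = self.snapshots[-1]

    def checkpoint(self) -> None:
        """Save the current state to (simulated) stable storage."""
        self.stable = (self.lvt, self.state)

    def crash_and_recover(self) -> int:
        """Lose volatile snapshots, then treat the failure as a straggler
        whose receive time is the virtual time of the stable checkpoint."""
        self.snapshots = [self.stable]
        self.rollback(self.stable[0])
        return self.stable[0]  # virtual time other LPs would be told about


lp = LogicalProcess()
lp.process(10, 1)
lp.checkpoint()          # stable checkpoint at virtual time 10
lp.process(20, 5)
lp.process(30, 7)
lp.crash_and_recover()
print(lp.lvt, lp.state)  # prints: 10 1
```

In this sketch the recovery path adds no separate restore logic beyond reloading the checkpoint; the existing `rollback` does the rest, which is the kind of implementation saving the abstract refers to.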

Om P. Damani, Vijay K. Garg

Modelling And Simulation | PADS 1998 | Simulation Scheme | Stable Storage | Virtual Time

Post Info

Added	05 Aug 2010
Updated	05 Aug 2010
Type	Conference
Year	1998
Where	PADS
Authors	Om P. Damani, Vijay K. Garg
