Sciweavers

» On Coordinated Checkpointing in Distributed Systems
ISCC
2002
IEEE
14 years 1 month ago
Session level rollback recovery
The problem of rollback recovery is traditionally approached using a model oriented to packet delivery. Instead, we introduce a model centered around complex sessions, and we expl...
Augusto Ciuffoletti
ICDCS
2002
IEEE
14 years 1 month ago
Process Migration: A Generalized Approach Using a Virtualizing Operating System
Process migration has been used to perform specialized tasks, such as load sharing and checkpoint/restarting long running applications. Implementation typically consists of modifi...
Tom Boyd, Partha Dasgupta
CLUSTER
2005
IEEE
14 years 2 months ago
Minimizing the Network Overhead of Checkpointing in Cycle-harvesting Cluster Environments
Cycle-harvesting systems such as Condor have been developed to make desktop machines in a local area (which are often similar to clusters in hardware configuration) available as ...
Daniel Nurmi, John Brevik, Richard Wolski
DSN
2004
IEEE
14 years 15 days ago
Optimal Object State Transfer - Recovery Policies for Fault Tolerant Distributed Systems
Recent developments in the field of object-based fault tolerance and the advent of the first OMG FTCORBA compliant middleware raise new requirements for the design process of dist...
Panagiotis Katsaros, Constantine Lazos
HPDC
1999
IEEE
14 years 1 month ago
Process Hijacking
Process checkpointing is a basic mechanism required for providing High Throughput Computing service on distributively owned resources. We present a new process checkpoint and migr...
Victor C. Zandy, Barton P. Miller, Miron Livny