Sciweavers


CLUSTER
2005
IEEE


Minimizing the Network Overhead of Checkpointing in Cycle-harvesting Cluster Environments


Download pompone.cs.ucsb.edu

Cycle-harvesting systems such as Condor have been developed to make desktop machines in a local area (which are often similar to clusters in hardware configuration) available as a compute platform. To provide a dual-use capability, opportunistic jobs harvesting cycles from the desktop must be checkpointed before the desktop resources are reclaimed by their owners and the job is evacuated. In this paper, we investigate a new system for computing efficient checkpoint schedules in cycle-harvesting environments. Our system records the historical availability from each resource and fits a statistical model to the observations. Because checkpointing must often traverse the network (i.e., the desktop hosts do not provide sufficient persistent storage for checkpoints), we combine this model with predictions of network performance to the storage site to compute a checkpoint schedule. When an application is initiated on a particular resource, the system uses the computed distribution to param...
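The pipeline the abstract outlines (fit a model to observed availability, estimate the checkpoint cost over the network, derive a schedule) can be illustrated with a minimal sketch. This is not the authors' system: it assumes an exponential availability model and substitutes Young's classic first-order approximation for the checkpoint interval, since the abstract does not state which model or formula the paper uses. All function names and sample values below are hypothetical.

```python
import math
from statistics import mean


def fit_exponential_mtbf(availability_durations_s):
    """Fit an exponential model to observed availability intervals.

    Assumption: availability is exponentially distributed, so the
    maximum-likelihood estimate of its mean is the sample mean.
    """
    if not availability_durations_s:
        raise ValueError("need at least one observed availability interval")
    return mean(availability_durations_s)


def checkpoint_cost_s(checkpoint_size_mb, predicted_bandwidth_mb_per_s):
    """Time to ship one checkpoint to remote storage at the predicted bandwidth."""
    if predicted_bandwidth_mb_per_s <= 0:
        raise ValueError("predicted bandwidth must be positive")
    return checkpoint_size_mb / predicted_bandwidth_mb_per_s


def young_interval_s(cost_s, mtbf_s):
    """Young's approximation: interval = sqrt(2 * checkpoint cost * MTBF)."""
    return math.sqrt(2.0 * cost_s * mtbf_s)


# Hypothetical history for one desktop: seconds available before each reclaim.
history = [3600.0, 7200.0, 1800.0, 5400.0]

mtbf = fit_exponential_mtbf(history)      # mean availability in seconds
cost = checkpoint_cost_s(500.0, 10.0)     # 500 MB image over a 10 MB/s link
interval = young_interval_s(cost, mtbf)   # seconds of work between checkpoints

print(f"mean availability: {mtbf:.0f} s")
print(f"checkpoint cost:   {cost:.0f} s")
print(f"interval:          {interval:.1f} s")
```

A slower predicted link raises the checkpoint cost and therefore lengthens the interval, which is the trade-off between network overhead and lost work that the abstract describes.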

Daniel Nurmi, John Brevik, Richard Wolski


Checkpoint Schedule | CLUSTER 2005 | Cluster Computing | Efficient Checkpoint Schedules | Network Overheads


Related Content

» A Scalable Asynchronous Replication-Based Strategy for Fault Tolerant MPI Applications

» Coordinated Checkpoint versus Message Log for Fault Tolerant MPI

» Analysis of Clustering and Routing Overhead for Clustered Mobile Ad Hoc Networks

» Fast cluster failover using virtual memory-mapped communication

» Coherence-based Coordinated Checkpointing for Software Distributed Shared Memory Systems

» Coordinated checkpoint from message payload in pessimistic sender-based message logging

» Optimized Distributed Data Sharing Substrate in Multicore Commodity Clusters A Comprehensi...

» Transparent Network Connectivity in Dynamic Cluster Environments

» Efficient Prioritized Service Recovery Using Content-Aware Routing Mechanism in Web Server ...

Post Info

Added	24 Jun 2010
Updated	24 Jun 2010
Type	Conference
Year	2005
Where	CLUSTER
Authors	Daniel Nurmi, John Brevik, Richard Wolski
