Failure-Aware Construction and Reconfiguration of Distributed Virtual Machines for High Availability Computing

In large-scale clusters and computational grids, component failures are the norm rather than the exception. The occurrence of failures and their impact on system performance and operating costs have become an increasingly important concern for system designers and administrators. In this paper, we study how to use system resources efficiently in high-availability clusters with the support of virtual machine (VM) technology. We design a reconfigurable distributed virtual machine (RDVM) infrastructure for cluster computing. We propose failure-aware node selection strategies for the construction and reconfiguration of RDVMs. We leverage proactive failure management techniques to calculate nodes' reliability status. We consider both the performance and the reliability status of compute nodes when making selection decisions. We define a capacity-reliability metric to combine the effects of both factors in node selection, and propose Best-fit algorithms to find the best qualified nod...
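
As a rough illustration of the node-selection idea described in the abstract, the sketch below scores each node with a combined capacity-reliability value and picks the highest-scoring nodes that can satisfy a demand. The abstract does not give the metric's formula or the details of the Best-fit algorithms, so everything here is an assumption made for illustration: the weighted-sum score, the weight `alpha`, the `Node` fields, and the function names `capacity_reliability` and `select_nodes` are hypothetical and are not the paper's definitions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Node:
    """Hypothetical view of a compute node (field names are assumptions)."""
    name: str
    capacity: float     # available capacity, normalized to [0, 1]
    reliability: float  # estimated chance of staying failure-free, in [0, 1]


def capacity_reliability(node: Node, alpha: float = 0.5) -> float:
    """Assumed combined score: a weighted sum of capacity and reliability.

    alpha = 1.0 considers capacity only; alpha = 0.0 considers reliability only.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * node.capacity + (1.0 - alpha) * node.reliability


def select_nodes(nodes: Sequence[Node], demand: float, count: int,
                 alpha: float = 0.5) -> List[Node]:
    """Return the `count` highest-scoring nodes whose capacity meets `demand`.

    This highest-score-first ranking is a stand-in for the paper's Best-fit
    algorithms, whose details are not available in the abstract.
    """
    qualified = [n for n in nodes if n.capacity >= demand]
    if len(qualified) < count:
        raise ValueError("not enough qualified nodes")
    ranked = sorted(qualified,
                    key=lambda n: capacity_reliability(n, alpha),
                    reverse=True)
    return ranked[:count]
```

Under these assumptions, raising `alpha` favors nodes with spare capacity, while lowering it favors nodes predicted to be more reliable; a real system would take the reliability estimates from its failure-prediction component.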

Song Fu

CCGRID 2009 | Distributed And Parallel Computing | Node Selection | Reliability Status | Virtual Machine |

Post Info

Added	17 Aug 2010
Updated	17 Aug 2010
Type	Conference
Year	2009
Where	CCGRID
Authors	Song Fu
