Fast and transparent recovery for continuous availability of cluster-based servers

Recently there has been renewed interest in building reliable servers that support continuous application operation. Besides keeping system state consistent after a failure, one of the main challenges in achieving continuous operation is fast reconfiguration. The complexity and overhead of the failure reconfiguration mechanisms depend on the type of platform used as a server and the types of applications that need to be supported. In this paper we focus on providing support for shared-memory applications running on clusters of commodity nodes and interconnects. Achieving continuous operation for shared-memory applications on clusters presents two main challenges. (a) The fault tolerance mechanisms employed should be transparent to applications and should have low overhead during failure-free execution. (b) When failures occur, reconfiguration should proceed with minimum application disruption without requiring the full recovery of the fail...

Rosalia Christodoulopoulou, Kaloian Manassiev, Angelos Bilas, Cristiana Amza

Continuous Operation | Distributed And Parallel Computing | Failure Reconfiguration | PPOPP 2006 | Shared Memory Applications

Post Info

Added	14 Jun 2010
Updated	14 Jun 2010
Type	Conference
Year	2006
Where	PPOPP
Authors	Rosalia Christodoulopoulou, Kaloian Manassiev, Angelos Bilas, Cristiana Amza
