Recovery Schemes for High Availability and High Performance Distributed Real-Time Computing


Clusters and distributed systems offer fault tolerance and high performance through load sharing, and are thus attractive in real-time applications. When all computers are up and running, we would like the load to be evenly distributed among the computers. When one or more computers fail, the load must be redistributed. The redistribution is determined by the recovery scheme. The recovery scheme should keep the load as evenly distributed as possible even when the most unfavorable combinations of computers break down, i.e., we want to optimize the worst-case behavior. In this paper we define recovery schemes that are optimal for a number of important cases. We also show that the problem of finding optimal recovery schemes corresponds to the mathematical problem of finding sequences of integers with minimal sum for which all sums of subsequences are unique.
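The integer-sequence problem mentioned at the end of the abstract can be illustrated with a small brute-force sketch. It is not the authors' algorithm, and it rests on two assumptions that the abstract does not state: that "subsequences" means contiguous runs of the sequence, and that the integers are positive. The function names (`contiguous_sums`, `has_unique_sums`, `compositions`, `minimal_sequence`) are hypothetical and chosen only for this illustration.

```python
def contiguous_sums(seq):
    """Return the sums of all contiguous runs of seq (assumed reading of 'subsequence')."""
    sums = []
    for i in range(len(seq)):
        total = 0
        for j in range(i, len(seq)):
            total += seq[j]
            sums.append(total)
    return sums


def has_unique_sums(seq):
    """True if no two contiguous runs of seq have the same sum."""
    sums = contiguous_sums(seq)
    return len(sums) == len(set(sums))


def compositions(total, parts):
    """Yield every tuple of `parts` positive integers that adds up to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(1, total - parts + 2):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def minimal_sequence(n):
    """Exhaustively find a length-n sequence of positive integers with unique
    contiguous sums and the smallest possible total (only practical for small n)."""
    # n(n+1)/2 distinct positive sums are needed, so the total is at least that.
    total = n * (n + 1) // 2
    while True:
        for seq in compositions(total, n):
            if has_unique_sums(seq):
                return seq
        total += 1
```

Under these assumptions, `minimal_sequence(3)` returns `(1, 3, 2)`, whose contiguous sums 1, 2, 3, 4, 5, 6 are all different. The search is exponential in `n` and is meant only to make the problem statement concrete.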

Lars Lundberg, Daniel Häggander, Kamilla Klonowska, Charlie Svahnberg


Computers | Distributed And Parallel Computing | IPPS 2003 | Recovery Schemes | Systems Offer Fault |


Added	04 Jul 2010
Updated	04 Jul 2010
Type	Conference
Year	2003
Where	IPPS
Authors	Lars Lundberg, Daniel Häggander, Kamilla Klonowska, Charlie Svahnberg
