Sciweavers

IPPS
2007
IEEE

Fast Failure Detection in a Process Group

14 years 5 months ago
Fast Failure Detection in a Process Group
Failure detectors represent a very important building block in distributed applications. The speed and the accuracy of the failure detectors is critical to the performance of the applications built on them. In a common implementation of failure detector based on heartbeats, there is a tradeoff between speed and accuracy so it is difficult to be both fast and accurate. Based on the observation that in many distributed applications, one process takes a special role as the leader, we propose a Fast Failure Detection (FFD) algorithm that detects the failure of the leader both fast and accurately. Taking advantage of spatial multiple timeouts, FFD detects the failure of the leader within a time period of just a little more than one heartbeat interval, making it almost the fastest detection algorithm possible based on heartbeat messages. FFD could be used stand alone in a static configuration where the leader process is fixed at one site. In a dynamic setting, where the role of leader ha...
Xinjie Li, Monica Brockmeyer
Added 03 Jun 2010
Updated 03 Jun 2010
Type Conference
Year 2007
Where IPPS
Authors Xinjie Li, Monica Brockmeyer
Comments (0)