Sciweavers

HASE
2007
IEEE

Scalable, Adaptive, Time-Bounded Node Failure Detection

This paper presents a scalable, adaptive, and time-bounded general approach to assure reliable, real-time Node-Failure Detection (NFD) for large-scale, high-load networks composed of Commercial Off-The-Shelf (COTS) hardware and software. Nodes in the network are independent processors that may unpredictably fail, either temporarily or permanently. We present a generalizable, multilayer, dynamically adaptive monitoring approach to NFD in which a small, designated subset of the nodes is informed of node failures. This subset is notified of node failures in the network within a bounded interval of time after the failures occur. Except under conditions of massive system failure, the NFD system has a zero false negative rate by design: failures are always detected within a finite amount of time after they occur. The NFD system continually adjusts to decrease the false alarm rate as false alarms are detected. The NFD design utilizes nodes that transmit, within a given locality...
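The two properties the abstract names (failures always detected within finite time, and a false alarm rate that falls as false alarms are observed) can be illustrated with a small sketch. It is not the paper's design: the abstract is truncated before it describes the mechanism, so the heartbeat model, the class name AdaptiveDetector, and the parameters timeout, backoff, and max_timeout are all assumptions made for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class AdaptiveDetector:
    """Hypothetical timeout-based monitor; NOT the paper's actual design."""
    timeout: float = 1.0       # assumed: seconds of silence before suspicion
    backoff: float = 1.5       # assumed: growth factor after a false alarm
    max_timeout: float = 30.0  # assumed: cap that keeps detection time bounded
    last_seen: dict = field(default_factory=dict)
    suspected: set = field(default_factory=set)
    false_alarms: int = 0

    def heartbeat(self, node: str, now: float) -> None:
        """Record a message from `node` received at time `now`."""
        if node in self.suspected:
            # Simplification: any message from a suspected node is counted
            # as a false alarm, so a genuine temporary failure followed by
            # recovery is not distinguished from a late heartbeat.
            self.suspected.discard(node)
            self.false_alarms += 1
            self.timeout = min(self.timeout * self.backoff, self.max_timeout)
        self.last_seen[node] = now

    def check(self, now: float) -> set:
        """Return the nodes that become suspected at time `now`."""
        newly = {
            n for n, t in self.last_seen.items()
            if now - t > self.timeout and n not in self.suspected
        }
        self.suspected |= newly
        return newly
```

Because the timeout never exceeds max_timeout, a node that stays silent is flagged by the first check made more than max_timeout after its last message, which is the sense in which this sketch has no false negatives. Each detected false alarm lengthens the timeout, trading slower detection for fewer spurious reports.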
Added 02 Jun 2010
Updated 02 Jun 2010
Type Conference
Year 2007
Where HASE
Authors Matthew Gillen, Kurt Rohloff, Prakash Manghwani, Richard E. Schantz