This paper presents a scalable, adaptive and timebounded general approach to assure reliable, real-time Node-Failure Detection (NFD) for large-scale, high load networks comprised of Commercial Off-The-Shelf (COTS) hardware and software. Nodes in the network are independent processors which may unpredictably fail either temporarily or permanently. We present a generalizable, multilayer, dynamically adaptive monitoring approach to NFD where a small, designated subset of the nodes are communicated information about node failures. This subset of nodes are notified of node failures in the network within an interval of time after the failures. Except under conditions of massive system failure, the NFD system has a zero false negative rate (failures are always detected with in a finite amount of time after failure) by design. The NFD system continually adjusts to decrease the false alarm rate as false alarms are detected. The NFD design utilizes nodes that transmit, within a given locality...