Software failures in wireless sensor systems are notoriously difficult to debug. Resource constraints in wireless deployments substantially restrict visibility into the root causes of node-level system and application faults. At the same time, the cost of deploying wireless sensor systems often far exceeds the cumulative cost of all the sensor hardware, so software failures that completely disable a node are prohibitively expensive to repair in real-world applications, e.g., by on-site visits to replace or reset nodes. We describe NodeMD, a deployment management system that implements lightweight run-time detection, logging, and notification of software faults on wireless mote-class devices. NodeMD introduces a debug mode that catches a failure before it completely disables a node and drops the node into a stable state that enables further diagnosis and correction, thus avoiding on-site redeployment. We analyze the performance of NodeMD on a real-world app...