Abstract—Statistical fault localization techniques find suspicious faulty program entities in programs by comparing passed and failed executions. Existing studies show that such ...
Faults in an IP network have various causes such as the failure of one or more routers at the IP layer, fiber-cuts, failure of physical elements at the optical layer, or extraneo...
Srikanth Kandula, Dina Katabi, Jean-Philippe Vasse...
Designing a distributed fault tolerance algorithm requires careful analysis of both fault models and diagnosis strategies. A system will fail if there are too many active faults, ...
This paper addresses the run-time diagnosis of delay faults in functional units of microprocessors. Despite the popularity of the stuck-at fault model, it is no longer the only re...
Abstract--We present a fault tolerant task pool execution environment that is capable of performing fine-grain selective restart using a lightweight, distributed task completion tr...
James Dinan, Arjun Singri, P. Sadayappan, Sriram K...