OSDI
2000
ACM

Exploring Failure Transparency and the Limits of Generic Recovery

We explore the abstraction of failure transparency in which the operating system provides the illusion of failure-free operation. To provide failure transparency, an operating system must recover applications after hardware, operating system, and application failures, and must do so without help from the programmer or unduly slowing failure-free performance. We describe two invariants that must be upheld to provide failure transparency: one that ensures sufficient application state is saved to guarantee the user cannot discern failures, and another that ensures sufficient application state is lost to allow recovery from failures affecting application state. We find that several real applications get failure transparency in the presence of simple stop failures with overhead of 0-12%. Less encouragingly, we find that applications violate one invariant in the course of upholding the other for more than 90% of application faults and 3-15% of operating system faults, rendering transparent...
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where OSDI
Authors David E. Lowell, Subhachandra Chandra, Peter M. Chen