Microreboot - A Technique for Cheap Recovery

Download www.usenix.org

A significant fraction of software failures in large-scale Internet systems are cured by rebooting, even when the exact failure causes are unknown. However, rebooting can be expensive, causing nontrivial service disruption or downtime even when clusters and failover are employed. In this work we use separation of process recovery from data recovery to enable microrebooting, a fine-grain technique for surgically recovering faulty application components without disturbing the rest of the application. We evaluate microrebooting in an Internet auction system running on an application server. Microreboots recover most of the same failures as full reboots, but do so an order of magnitude faster and result in an order of magnitude savings in lost work. This cheap form of recovery engenders a new approach to high availability: microreboots can be employed at the slightest hint of failure, prior to node failover in multi-node clusters, even when mistakes in failure detection are likely; fail...
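
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names (`Component`, `Application`), the in-memory `store` dictionary standing in for separately recovered data, and the retry-once policy are all assumptions made for illustration. The sketch shows a faulty component being restarted on its own, with the failed call retried, while the other components and the stored data are left untouched.

```python
class Component:
    """Hypothetical application component; its durable data lives outside it."""

    def __init__(self, name, store):
        self.name = name
        self.store = store   # external state store (assumed); survives microreboots
        self.healthy = True
        self.starts = 0      # how many times this component has been (re)started
        self.start()

    def start(self):
        # Reinitialize only volatile, in-process state.
        self.healthy = True
        self.starts += 1

    def handle(self, key):
        if not self.healthy:
            raise RuntimeError(f"{self.name} is faulty")
        self.store[key] = self.store.get(key, 0) + 1
        return self.store[key]


class Application:
    """Hypothetical container that can restart one component at a time."""

    def __init__(self, names):
        self.store = {}
        self.components = {n: Component(n, self.store) for n in names}

    def microreboot(self, name):
        # Restart a single component; the others keep running.
        self.components[name].start()

    def call(self, name, key):
        # On failure, microreboot the component and retry the call once.
        try:
            return self.components[name].handle(key)
        except RuntimeError:
            self.microreboot(name)
            return self.components[name].handle(key)
```

For example, after `app = Application(["bids", "search"])`, marking `app.components["bids"].healthy = False` and then calling `app.call("bids", "item42")` restarts only the `bids` component; `search` keeps its original start count and the counter in `store` is preserved.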

George Candea, Shinichi Kawamoto, Yuichi Fujiki, Greg Friedman, Armando Fox

Faulty Application Components | Nontrivial Service Disruption | Operating System | OSDI 2004 | Transparent Call-level Retries |

» Autonomous recovery in componentized Internet applications

» Exception Handling in the Choices Operating System

» WiFi position estimation in industrial environments using Gaussian processes

» A Projector-Camera System with Real-Time Photometric Adaptation for Dynamic Environments

Post Info

Added	03 Dec 2009
Updated	03 Dec 2009
Type	Conference
Year	2004
Where	OSDI
Authors	George Candea, Shinichi Kawamoto, Yuichi Fujiki, Greg Friedman, Armando Fox
