OSDI
2006
ACM

EXPLODE: A Lightweight, General System for Finding Serious Storage System Errors

Storage systems such as file systems, databases, and RAID systems have a simple, basic contract: you give them data, they do not lose or corrupt it. Often they store the only copy, making its irrevocable loss almost arbitrarily bad. Unfortunately, their code is exceptionally hard to get right, since it must correctly recover from any crash at any program point, no matter how their state was smeared across volatile and persistent memory. This paper describes EXPLODE, a system that makes it easy to systematically check real storage systems for errors. It takes user-written, potentially system-specific checkers and uses them to drive a storage system into tricky corner cases, including crash recovery errors. EXPLODE uses a novel adaptation of ideas from model checking, a comprehensive, heavyweight formal verification technique, that makes its checking more systematic (and hopefully more effective) than a pure testing approach while being just as lightweight. EXPLODE is effective. It foun...
Junfeng Yang, Can Sar, Dawson R. Engler
Added 03 Dec 2009
Updated 03 Dec 2009
Type Conference
Year 2006
Where OSDI