We describe an approach to quantitatively evaluating human-assisted failure-recovery tools and processes in the environment of modern Internet- and enterprise-class server systems. Our approach can quantify the dependability impact of a single recovery system, and also enables comparisons between different recovery approaches. The approach combines aspects of dependability benchmarking with human user studies, incorporating human participants in the system evaluations yet still producing typical dependability-related metrics as results. We illustrate our methodology via a case study of a system-wide undo/redo recovery tool for e-mail services; our approach is able to expose the dependability benefits of the tool as well as point out areas where its behavior could use improvement.
Aaron B. Brown, Leonard Chung, William Kakes, Calv