Reproducing and learning from failures in deployed software are costly and difficult activities. They can be facilitated, however, if the circumstances leading to a failure are properly captured. In this work, we empirically investigate how various anomaly detection schemes can serve to identify the conditions that precede failures in deployed software. Our results expose the tradeoffs between different detection algorithms applied to several types of events under varying levels of in-house testing.
Sebastian G. Elbaum, Satya Kanduri, Anneliese Amschler Andrews