We propose a low-overhead sampling infrastructure for gathering information from the executions experienced by a program’s user community. Several example applications illustrate ways to use sampled instrumentation to isolate bugs. Assertion-dense code can be transformed to share the cost of assertions among many users. When assertions are lacking, broad guesses can be made about predicates that predict program errors, and a process of elimination can be used to whittle these down to the true bug. Finally, even for non-deterministic bugs such as memory corruption, statistical modeling based on logistic regression allows us to identify program behaviors that are strongly correlated with failure and are therefore likely places to look for the error.

Categories and Subject Descriptors
D.2.5 [Software Engineering]: Testing and Debugging—distributed debugging; G.3 [Mathematics of Computing]: Probability and Statistics—correlation and regression analysis; I.5.2 [Pattern Recognition]: Design Methodolo...
Ben Liblit, Alexander Aiken, Alice X. Zheng, Micha
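
The idea of sharing assertion cost through sampling can be illustrated with a minimal sketch. The abstract does not say how samples are selected, so the geometric countdown below is an assumption, and `SampledChecker`, `check`, `rate`, and the site names are hypothetical names chosen for illustration rather than part of the authors' infrastructure.

```python
import math
import random


class SampledChecker:
    """Evaluates each instrumentation site with probability `rate` (hypothetical helper)."""

    def __init__(self, rate, seed=None):
        if not 0.0 < rate <= 1.0:
            raise ValueError("rate must be in (0, 1]")
        self.rate = rate
        self.rng = random.Random(seed)
        self.countdown = self._next_gap()
        # site name -> [times sampled, times the predicate held]
        self.observed = {}

    def _next_gap(self):
        # Assumption: draw the distance to the next sample from a geometric
        # distribution (mean 1/rate) instead of flipping a coin at every site.
        if self.rate == 1.0:
            return 1
        u = 1.0 - self.rng.random()  # uniform in (0, 1]
        return int(math.log(u) / math.log(1.0 - self.rate)) + 1

    def check(self, site, predicate):
        # Fast path: decrement the countdown and skip the predicate entirely.
        self.countdown -= 1
        if self.countdown > 0:
            return
        # Slow path: this site was selected, so evaluate and record the predicate.
        self.countdown = self._next_gap()
        counts = self.observed.setdefault(site, [0, 0])
        counts[0] += 1
        if predicate():
            counts[1] += 1


def run_once(checker, values):
    # A toy "program" with one instrumented predicate per loop iteration.
    for v in values:
        checker.check("value_is_negative", lambda: v < 0)


checker = SampledChecker(rate=0.01, seed=1)
run_once(checker, range(-50_000, 50_000))
sampled, held = checker.observed["value_is_negative"]
print(f"sampled {sampled} of 100000 sites; predicate held in {held}")
```

The predicate is passed as a callable so that skipped sites never pay for evaluating it. Each run observes only about one site in a hundred, and counts like these, pooled across many runs, are the kind of input that elimination or regression analyses would consume.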