When choosing a testing technique, practitioners want to know which one will detect the faults that matter most to them in the programs that they plan to test. Do empirical evaluations of testing techniques provide this information? More often than not, they report how many faults in a carefully chosen “representative” sample the evaluated techniques detect. But the population of faults that such a sample would represent depends heavily on the faults’ context or environment, as does the cost of failing to detect those faults. If empirical studies are to provide information that a practitioner can apply outside the context of the study, they must characterize the faults studied in a way that translates across contexts. A testing technique’s fault-detecting abilities could then be interpreted relative to the fault characterization. In this paper, we present a list of criteria that a fault characterization must meet in order to be fit for this task, and we evaluate several well-...
Jaymie Strecker, Atif M. Memon