Using a specific machine learning technique, this paper proposes a way to identify suspicious statements during debugging. The technique is based on principles similar to Tarantula but addresses its main flaw: its difficulty in dealing with the presence of multiple faults, since it assumes that failing test cases execute the same fault(s). The improvement we present results from using C4.5 decision trees to identify distinct failure conditions based on information about the test cases' inputs and outputs. Failing test cases that execute under similar conditions are then assumed to fail due to the same fault(s). Statements are considered suspicious if they are covered by a large proportion of failing test cases that execute under similar conditions. We report on a case study that demonstrates improvement over the original Tarantula technique in terms of statement ranking. Another contribution of this paper is to show that failure conditions as modeled by a C4.5 decision tree...
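To make the ranking step concrete, the following is a minimal Python sketch of a Tarantula-style suspiciousness score computed within a single group of failing test cases, such as the group falling into one leaf of a C4.5 decision tree. All names (`coverage`, `failing_cluster`, `passing`) are hypothetical, and the code illustrates the scoring idea under these assumptions rather than the paper's actual implementation.

```python
def tarantula_suspiciousness(coverage, failing_cluster, passing):
    """Rank statements for one group of failing tests assumed to fail
    due to the same fault(s).

    coverage:        dict mapping test id -> set of covered statement ids
    failing_cluster: ids of failing tests in one failure-condition group
    passing:         ids of passing tests
    Assumes failing_cluster is non-empty.
    """
    statements = set().union(*(coverage[t] for t in coverage))
    total_fail = len(failing_cluster)
    total_pass = len(passing)
    scores = {}
    for s in statements:
        # Proportion of failing/passing tests that cover statement s.
        fail_ratio = sum(s in coverage[t] for t in failing_cluster) / total_fail
        pass_ratio = (sum(s in coverage[t] for t in passing) / total_pass
                      if total_pass else 0.0)
        denom = fail_ratio + pass_ratio
        # Tarantula's suspiciousness: close to 1 when a statement is
        # covered mostly by failing tests, close to 0 when mostly passing.
        scores[s] = fail_ratio / denom if denom else 0.0
    # Return statements ordered from most to least suspicious.
    return sorted(scores, key=scores.get, reverse=True)
```

Running this per failure-condition group, rather than once over all failing tests as in the original Tarantula, is what lets statements tied to different faults be ranked separately.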
Lionel C. Briand, Yvan Labiche, Xuetao Liu