The use of mixture model analysis, a statistical technique, as a tool for the early prediction of fault-prone program modules is investigated. The Expectation-Maximization (EM) algorithm is used to fit the model. Using only software size and complexity metrics, the technique can build a model for predicting software quality without prior knowledge of the number of faults in the modules. In addition, the Akaike Information Criterion (AIC) is used to select the number of mixture components, which is taken to be the number of classes into which the program modules should be classified. The technique classifies software modules into fault-prone and non-fault-prone classes with a relatively low error rate, providing a reliable indicator for software quality prediction.
Ping Guo, Michael R. Lyu
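
The approach summarized in the abstract (fit a mixture model with EM, then choose the number of components by AIC) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the components are taken to be Gaussian, a single synthetic metric (a hypothetical log-size value) stands in for the full set of size and complexity metrics, and the names `make_demo_data`, `fit_gmm_1d`, `aic`, `select_k`, and `classify` are hypothetical helpers.

```python
import math
import random


def make_demo_data(seed=0):
    """Synthetic, unlabeled 1-D metric values (illustrative only, not real data)."""
    rng = random.Random(seed)
    small = [rng.gauss(4.0, 0.5) for _ in range(200)]  # assumed low-complexity modules
    large = [rng.gauss(7.0, 0.5) for _ in range(100)]  # assumed high-complexity modules
    return small + large


def normal_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fit_gmm_1d(data, k, iters=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture by EM.

    Returns (weights, means, variances, log_likelihood).
    """
    n = len(data)
    ordered = sorted(data)
    # Deterministic initialization: place the means at evenly spaced quantiles.
    means = [ordered[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, min_var)] * k
    weights = [1.0 / k] * k

    def mixture_terms(x):
        return [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]

    for _ in range(iters):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            terms = mixture_terms(x)
            total = sum(terms) or 1e-300  # guard against numerical underflow
            resp.append([t / total for t in terms])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var_j = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(var_j, min_var)  # floor avoids variance collapse

    log_lik = sum(math.log(sum(mixture_terms(x)) or 1e-300) for x in data)
    return weights, means, variances, log_lik


def aic(log_lik, k):
    """AIC = 2p - 2 log L, with p = 3k - 1 free parameters in one dimension."""
    return 2 * (3 * k - 1) - 2 * log_lik


def select_k(data, max_k=3):
    """Return the component count with the lowest AIC, plus all AIC scores."""
    scores = {k: aic(fit_gmm_1d(data, k)[3], k) for k in range(1, max_k + 1)}
    return min(scores, key=scores.get), scores


def classify(x, weights, means, variances):
    """Assign x to the component with the highest posterior probability."""
    post = [w * normal_pdf(x, m, v) for w, m, v in zip(weights, means, variances)]
    return max(range(len(post)), key=post.__getitem__)


if __name__ == "__main__":
    data = make_demo_data()
    best_k, scores = select_k(data)
    print("AIC by number of components:", scores)
    print("selected number of components:", best_k)
```

No fault counts are supplied to the fit, which mirrors the unsupervised setting described in the abstract. Deciding which fitted component corresponds to the fault-prone class is a separate step; one plausible convention, assumed here rather than taken from the paper, is to label the component with the larger mean metric value as fault-prone.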