Automated static analysis can identify potential source code anomalies early in the software development process that could lead to field failures. However, only a small portion of static analysis alerts may be important to the developer (actionable); the remainder are false positives (unactionable). We propose a process for building false positive mitigation models that classify static analysis alerts as actionable or unactionable using machine learning techniques. For two open source projects, we identify sets of alert characteristics predictive of actionable and unactionable alerts from 51 candidate characteristics. Using these selected characteristics, we evaluate 15 machine learning algorithms that build models to classify alerts. We obtained 88-97% average accuracy for both projects in classifying alerts using 3 to 14 alert characteristics. Additionally, the set of selected alert characteristics and best models differed between the two projects, suggesting that false positive mitigation models may be project specific.
Sarah Smith Heckman, Laurie A. Williams
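
To make the described process concrete, below is a minimal sketch, assuming scikit-learn, of how such a false positive mitigation pipeline could be assembled: select a small predictive subset of alert characteristics, then train and cross-validate candidate classifiers. The feature matrix, labels, chosen learners, and the value of k are illustrative assumptions, not the paper's actual alert characteristics, algorithms, or results.

```python
# Illustrative sketch (not the authors' implementation) of classifying
# static analysis alerts as actionable vs. unactionable.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical placeholder data: 200 alerts, each described by
# 51 candidate characteristics (as in the paper's candidate set).
rng = np.random.default_rng(0)
X = rng.random((200, 51))
y = rng.integers(0, 2, size=200)   # 1 = actionable, 0 = unactionable

# Two example learners stand in for the paper's 15 evaluated algorithms.
candidates = {
    "decision_tree": DecisionTreeClassifier(max_depth=5),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

for name, clf in candidates.items():
    # Select a small predictive subset of characteristics (the paper
    # reports that 3 to 14 suffice), then classify on that subset.
    model = Pipeline([
        ("select", SelectKBest(f_classif, k=10)),
        ("classify", clf),
    ])
    scores = cross_val_score(model, X, y, cv=10)  # 10-fold cross-validation
    print(f"{name}: mean accuracy = {scores.mean():.2f}")
```

Wrapping feature selection and classification in a single pipeline keeps the characteristic-selection step inside each cross-validation fold, which mirrors the abstract's two-stage structure of selecting predictive characteristics and then building classification models from them.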