This paper proposes a novel method of fusing models for the classification of unbalanced data. The unbalanced data contain a majority of healthy (negative) instances and a minority of unhealthy (positive) instances. Because this type of classification problem applies to security applications, such problems are named security classification problems (SCP). The area under the ROC curve (AUC) is the metric used to measure classifier performance; to better understand AUC and ROC behavior, pseudoROC curves created from simulated data are introduced. ROC curves depend entirely upon the rankings created by classifiers. The rank distributions discussed in this paper display classifier performance in a novel form, and their behavior provides insight into classifier fusion for the SCP. Rank distributions, which illustrate the probability of a particular rank containing a positive or negative instance, will be introduced an...
Paul F. Evangelista, Mark J. Embrechts, Boleslaw K
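
The abstract's statement that ROC curves depend entirely upon classifier rankings can be illustrated with a minimal sketch. It computes AUC from ranks alone, using the standard Mann-Whitney U identity, and is not the paper's fusion method. The function name `auc_from_ranks` and the toy scores and labels are hypothetical, chosen only to mimic an unbalanced set with few positive instances.

```python
def auc_from_ranks(scores, labels):
    """Compute AUC from the ordering of scores only (Mann-Whitney U identity).

    labels: 1 for positive (unhealthy), 0 for negative (healthy).
    Higher scores are assumed to indicate the positive class.
    """
    n = len(scores)
    # Indices sorted by ascending score.
    order = sorted(range(n), key=lambda i: scores[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < n and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        # Tied instances share the average of their 1-based ranks.
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative instance")

    # U statistic: rank sum of positives minus its minimum possible value.
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u = rank_sum_pos - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)


# Hypothetical toy data: 8 negatives, 2 positives (unbalanced).
toy_scores = [0.05, 0.10, 0.20, 0.25, 0.30, 0.40, 0.55, 0.60, 0.50, 0.90]
toy_labels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]

auc_raw = auc_from_ranks(toy_scores, toy_labels)
# A strictly increasing transform changes the scores but not their ranking.
auc_cubed = auc_from_ranks([s ** 3 for s in toy_scores], toy_labels)
print(auc_raw, auc_cubed)
```

Because cubing the scores preserves their order, both calls return the same AUC, which is the sense in which the ROC curve is a function of the ranking rather than of the raw classifier outputs.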