A new framework is proposed for comparing evaluation metrics in classification applications with imbalanced datasets (i.e., the probability of one class vastly exceeds others). For model selection as well as testing the performance of a classifier, this framework finds the most suitable evaluation metric amongst a number of metrics. We apply this framework to compare two metrics: overall accuracy and Kappa coefficient. Simulation results demonstrate that Kappa coefficient is more suitable.
Mehrdad Fatourechi, Rabab K. Ward, Steven G. Mason