Support vector machines (SVMs) are regularly applied to unbalanced classification problems by giving errors on the rare class a larger weight in the training objective. This heuristic is often used to learn classifiers with high F-measure, although this particular application of SVMs has not been rigorously examined. We provide new theoretical results that support this popular heuristic. Specifically, we demonstrate that, with the right parameter settings, SVMs approximately optimize F-measure in the same way that they are already known to approximately optimize accuracy. This finding has a number of theoretical and practical implications for using SVMs in F-measure optimization.
David R. Musicant, Vipin Kumar, Aysel Ozgur
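To make the class-weighting heuristic concrete, the following is a minimal sketch in plain Python. It assumes labels in {+1, -1} with +1 as the rare class, a linear classifier, and the standard soft-margin hinge loss with separate penalties for the two classes. The function names (`f_measure`, `accuracy`, `weighted_hinge_objective`) and the parameters `c_pos` and `c_neg` are hypothetical, chosen for illustration. They are not the authors' notation, and the sketch does not reproduce their theoretical results.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def f_measure(y_true, y_pred, beta=1.0):
    """F-beta score for the positive (+1) class; returns 0.0 if undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == -1)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def weighted_hinge_objective(w, b, X, y, c_pos, c_neg):
    """Soft-margin objective with per-class penalties (illustrative assumption):
    0.5 * ||w||^2 + c_pos * (slack on +1 examples) + c_neg * (slack on -1 examples).
    """
    reg = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        score = sum(wj * xj for wj, xj in zip(w, xi)) + b
        slack = max(0.0, 1.0 - yi * score)  # hinge loss for this example
        loss += (c_pos if yi == 1 else c_neg) * slack
    return reg + loss


# Toy unbalanced data: 10 rare positives, 990 negatives, one dummy feature.
y = [1] * 10 + [-1] * 990
X = [[0.0] for _ in y]

# A classifier that always predicts the majority class.
always_negative = [-1] * len(y)
acc = accuracy(y, always_negative)   # high, despite missing every positive
f1 = f_measure(y, always_negative)   # zero, because no positive is found

# The same trivial classifier (w = 0, b = -1) scored by the training objective,
# first with equal penalties, then with the rare class weighted more heavily.
equal_cost = weighted_hinge_objective([0.0], -1.0, X, y, c_pos=1.0, c_neg=1.0)
rare_heavy_cost = weighted_hinge_objective([0.0], -1.0, X, y, c_pos=99.0, c_neg=1.0)
```

On this toy data the always-negative classifier scores well on accuracy but has an F-measure of zero, and raising `c_pos` makes the same classifier far more expensive under the training objective. The ratio 99:1 mirrors the class imbalance here and is only an illustrative choice, not a recommended setting.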