This paper investigates the performance of machine learning methods for classifying rock types from hyperspectral data. The main objective is to test how calibrating a model's output into class probability estimates affects the classification error rate. The base classifiers included in this study are boosted decision trees, support vector machines, and logistic regression. The standard algorithm for some of these methods provides a non-probabilistic, hard decision as output. For those methods, posterior class probability estimates were approximated by fitting a sigmoid function to the classifier predictions. To perform multi-class classification, a one-versus-all approach was used. The different methods were compared using hyperspectral data acquired from ore-bearing rocks under different environmental conditions. Calibrating the class probabilities improved the overall performance of almost all algorithms tested; an improvement of over 10% was observed in some cases.
Sildomar T. Monteiro, Richard J. Murphy
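The sigmoid calibration and one-versus-all combination summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each binary classifier emits a real-valued score, fits the two sigmoid parameters by plain gradient descent on the log-loss (the regularized targets commonly used with this kind of fit are omitted), and normalizes the per-class probabilities before taking the most probable class. The function names `fit_sigmoid` and `calibrated_one_vs_all`, the learning rate, the iteration count, and the toy scores are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_sigmoid(scores, labels, lr=0.1, n_iter=2000):
    """Fit P(y=1 | s) = 1 / (1 + exp(a*s + b)) by gradient descent on log-loss.

    Illustrative only: lr and n_iter are arbitrary, untuned values.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=float)
    a, b = 0.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(a * scores + b))
        # With z = a*s + b, the derivative of the log-loss w.r.t. z is (y - p).
        grad_z = labels - p
        a -= lr * np.mean(grad_z * scores)
        b -= lr * np.mean(grad_z)
    return a, b


def calibrated_one_vs_all(score_matrix, params):
    """Map raw one-versus-all scores to normalized class probabilities.

    score_matrix: (n_samples, n_classes) raw scores, one column per class.
    params: list of (a, b) sigmoid parameters, one pair per class.
    """
    probs = np.column_stack([
        1.0 / (1.0 + np.exp(a * score_matrix[:, k] + b))
        for k, (a, b) in enumerate(params)
    ])
    # Normalize so that each row sums to one across classes.
    return probs / probs.sum(axis=1, keepdims=True)


# Toy data (hypothetical): raw scores from three one-versus-all classifiers.
scores = np.array([
    [2.0, -1.0, -1.5],
    [-1.0, 1.5, -0.5],
    [-2.0, -1.0, 1.0],
    [1.0, -0.5, -2.0],
])
y = np.array([0, 1, 2, 0])

# One sigmoid per class, fitted on "this class versus the rest" labels.
params = [fit_sigmoid(scores[:, k], (y == k).astype(float)) for k in range(3)]
probs = calibrated_one_vs_all(scores, params)
predicted = probs.argmax(axis=1)
```

In practice the sigmoid would be fitted on held-out scores rather than on the data used to train the base classifier, so that the calibration does not inherit the classifier's training bias.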