AdaBoost rarely suffers from overfitting on low-noise data. However, recent studies on highly noisy patterns have clearly shown that overfitting can occur. A natural strategy for alleviating the problem is to penalize the skewness of the example-weight distribution during learning, so that a few hardest examples cannot spoil the decision boundary. In this paper, we describe in detail how such a penalty scheme can be pursued both in the mathematical programming setting and in the boosting setting. Using two smooth convex penalty functions, we define two new soft margin concepts and propose two new regularized AdaBoost algorithms. The effectiveness of the proposed algorithms is demonstrated in a large-scale experiment, where they perform at least as well as, and often much better than, other regularized AdaBoost algorithms.
Yijun Sun, Jian Li, William W. Hager
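To illustrate the idea of penalizing distribution skewness, the sketch below implements plain AdaBoost with decision stumps and, after each round, mixes the example-weight distribution back toward uniform. This uniform-mixing step and its parameter `eta` are illustrative assumptions, not the paper's penalty functions; the paper derives its regularization from two smooth convex penalties.

```python
import numpy as np

def stump_fit(X, y, w):
    # Exhaustively pick the weighted-error-minimizing threshold stump.
    best = None
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = np.sum(w[pred != y])
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best

def stump_predict(X, j, thr, sign):
    return sign * np.where(X[:, j] <= thr, 1, -1)

def adaboost_smoothed(X, y, T=10, eta=0.1):
    """AdaBoost with a uniform-mixing penalty on the example
    distribution (eta is a hypothetical smoothing parameter)."""
    n = X.shape[0]
    w = np.full(n, 1.0 / n)
    stumps, alphas = [], []
    for _ in range(T):
        err, j, thr, sign = stump_fit(X, y, w)
        err = max(err, 1e-10)
        if err >= 0.5:
            break
        alpha = 0.5 * np.log((1 - err) / err)
        pred = stump_predict(X, j, thr, sign)
        # Standard multiplicative AdaBoost weight update.
        w = w * np.exp(-alpha * y * pred)
        w /= w.sum()
        # Penalize skewness: pull the distribution toward uniform so
        # a few hard (possibly noisy) examples cannot dominate.
        w = (1 - eta) * w + eta / n
        stumps.append((j, thr, sign))
        alphas.append(alpha)
    def predict(Xq):
        score = sum(a * stump_predict(Xq, j, t, s)
                    for a, (j, t, s) in zip(alphas, stumps))
        return np.sign(score)
    return predict

# Toy data: two separable clusters, with one deliberately flipped
# (noisy) label that an unregularized booster would chase.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 0.3, (20, 2)), rng.normal(1, 0.3, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)
y[0] = 1  # a "hard" noisy example
clf = adaboost_smoothed(X, y, T=10, eta=0.2)
acc = np.mean(clf(X) == y)
```

With the smoothing in place, the ensemble accepts a small training error on the flipped point instead of warping its decision boundary around it.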