The class imbalance problem, in which one class has far fewer samples than the others, is of great importance in machine learning because it arises in many critical applications. In this work we introduce the Recursive Partitioning of the Majority Class (REPMAC) algorithm, a new hybrid method for imbalanced problems. Using a clustering method, REPMAC recursively splits the majority class into several subsets, building a decision tree, until the resulting subproblems are balanced or easy to solve. At that point, a classifier is fitted to each subproblem. We evaluate the new method on 7 datasets from the UCI repository and find that REPMAC is more efficient than other methods usually applied to imbalanced datasets.
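The recursive scheme described above can be illustrated with a minimal sketch. The abstract does not fix the clustering method, the leaf classifier, the stopping thresholds, or the prediction rule, so everything concrete here is an assumption made for illustration only: plain 2-means for clustering, a nearest-centroid classifier at each leaf, a size-ratio test for "balanced" (`max_ratio`), a per-class recall test for "easy to solve" (`easy_recall`), and prediction by routing a sample to the nearest majority-cluster centre. All function and parameter names are hypothetical.

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points (tuples of floats)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def two_means(points, rng, iters=20):
    """Plain 2-means (an assumed choice of clustering); one group may be empty."""
    centers = rng.sample(points, 2)
    groups = ([], [])
    for _ in range(iters):
        groups = ([], [])
        for p in points:
            nearest = 0 if dist2(p, centers[0]) <= dist2(p, centers[1]) else 1
            groups[nearest].append(p)
        if not groups[0] or not groups[1]:
            break
        centers = [mean(g) for g in groups]
    return groups

def fit_leaf(majority, minority):
    """Nearest-centroid classifier (assumed): returns 1 for minority, 0 for majority."""
    c_maj, c_min = mean(majority), mean(minority)
    return lambda x: 1 if dist2(x, c_min) < dist2(x, c_maj) else 0

def build(majority, minority, rng, max_ratio=1.5, easy_recall=0.95,
          depth=0, max_depth=8):
    """Recursively split the majority subset until the subproblem is
    balanced or easy; every subproblem is paired with the full minority class."""
    clf = fit_leaf(majority, minority)
    node = {"center": mean(majority), "clf": clf, "children": []}
    # Assumed criterion for "balanced": majority at most max_ratio times minority.
    balanced = len(majority) <= max_ratio * len(minority)
    # Assumed criterion for "easy": high training recall on both classes.
    recall_maj = sum(clf(x) == 0 for x in majority) / len(majority)
    recall_min = sum(clf(x) == 1 for x in minority) / len(minority)
    easy = min(recall_maj, recall_min) >= easy_recall
    if balanced or easy or depth >= max_depth or len(majority) < 2:
        return node
    parts = [g for g in two_means(majority, rng) if g]
    if len(parts) == 2:
        node["children"] = [
            build(g, minority, rng, max_ratio, easy_recall, depth + 1, max_depth)
            for g in parts
        ]
    return node

def predict(node, x):
    """Assumed prediction rule: descend to the nearest majority-cluster
    centre, then apply that leaf's classifier."""
    while node["children"]:
        node = min(node["children"], key=lambda child: dist2(x, child["center"]))
    return node["clf"](x)

def count_leaves(node):
    """Number of leaf subproblems in the tree."""
    if not node["children"]:
        return 1
    return sum(count_leaves(child) for child in node["children"])

def blob(rng, cx, n):
    """Synthetic 2-D Gaussian blob centred at (cx, 0), for the toy example."""
    return [(rng.gauss(cx, 0.5), rng.gauss(0.0, 0.5)) for _ in range(n)]

# Toy data: two majority blobs with a small minority blob between them.
rng = random.Random(0)
majority = blob(rng, 0.0, 40) + blob(rng, 10.0, 40)
minority = blob(rng, 5.0, 8)
tree = build(majority, minority, rng)
print(count_leaves(tree), predict(tree, (5.0, 0.0)), predict(tree, (0.0, 0.0)))
```

In this toy setting a single nearest-centroid classifier cannot separate the classes, because the majority and minority centroids nearly coincide. Splitting the majority class first yields subproblems in which the same simple classifier is adequate, which is the intuition behind partitioning before fitting.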
Hernán Ahumada, Guillermo L. Grinblat, Luca