Abstract. The aim of this work is to forecast future events in financial data sets, in particular, we focus our attention on situations where positive instances are rare, which falls into the domain of Chance Discovery. Machine learning classifiers extend the past experiences into the future. However the imbalance between positive and negative cases poses a serious challenge to machine learning techniques. Because it favours negative classifications, which has a high chance of being correct due to the nature of the data. Genetic Algorithms have the ability to create multiple solutions for a single problem. To exploit this feature we propose to analyse the decision trees created by Genetic Programming. The objective is to extract and collect different rules that classify the positive cases. It lets model the rare cases in different ways, increasing the possibility of identifying similar cases in the future. The over-learning produced by this method attempts to compensate the lack of pos...
Alma Lilia Garcia-Almanza, Edward P. K. Tsang