Estimation of Distribution Algorithms (EDAs) are a class of evolutionary algorithms that use machine learning techniques to solve optimization problems. A probabilistic model is learned from the selected population, and the next population is then generated by sampling from this model. An important phenomenon in learning from data is overfitting, which occurs when the model adapts so closely to the specifics of the training data that even noise is encoded. The purpose of this paper is to investigate whether overfitting happens in EDAs and to discover its consequences. The findings are that overfitting does occur in EDAs, that overfitting correlates with EDA performance, and that reducing overfitting through early stopping can improve EDA performance.

Categories and Subject Descriptors: I.2.8 [Artificial Intelligence]: Problem Solving, Control Methods, and Search

General Terms: Algorithms

Keywords: Overfitting, Estimation of Distribution Algorithms, Bayesian Optimiza...
Hao Wu, Jonathan L. Shapiro
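
The loop summarized in the abstract (select, learn a probabilistic model of the selected individuals, sample the next population) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes the simplest possible model, independent per-bit marginals (a univariate EDA in the style of UMDA), and a toy OneMax fitness function. The names `onemax` and `umda`, all parameter values, and the clamping of probabilities are illustrative assumptions, and early stopping of model learning is not shown.

```python
import random


def onemax(bits):
    """Toy fitness (an assumption for illustration): number of ones."""
    return sum(bits)


def umda(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Univariate EDA sketch: sample, select, refit marginals, repeat."""
    rng = random.Random(seed)
    # Start from a uniform model: each bit is 1 with probability 0.5.
    probs = [0.5] * n_bits
    best = None
    # Clamp bounds keep every probability strictly inside (0, 1),
    # so no bit value can be lost permanently from the model.
    lo, hi = 1.0 / n_bits, 1.0 - 1.0 / n_bits
    for _ in range(generations):
        # Generate a population by sampling from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        # Truncation selection: keep the fittest individuals.
        population.sort(key=onemax, reverse=True)
        selected = population[:n_select]
        if best is None or onemax(selected[0]) > onemax(best):
            best = selected[0]
        # Learn the model: frequency of ones per bit in the selected set.
        probs = [
            min(hi, max(lo, sum(ind[i] for ind in selected) / n_select))
            for i in range(n_bits)
        ]
    return best, probs


if __name__ == "__main__":
    best, probs = umda()
    print("best fitness:", onemax(best))
```

In this sketch the selected population plays the role of training data for the model. A richer model class, such as a Bayesian network, has far more freedom to fit sampling noise in that selected set, which is the setting in which the abstract's overfitting question arises.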