
GECCO
2005
Springer

A statistical learning theory approach of bloat

Code bloat, the excessive increase of code size, is an important issue in Genetic Programming (GP). This paper proposes a theoretical analysis of code bloat in the framework of symbolic regression in GP, from the viewpoint of Statistical Learning Theory, a well-grounded mathematical toolbox for Machine Learning. Two kinds of bloat must be distinguished in that context, depending on whether the target function lies in the search space or not. Then, important mathematical results are proved using classical results from Statistical Learning. Namely, the Vapnik-Chervonenkis dimension of programs is computed, and further results from Statistical Learning make it possible to prove that a parsimonious fitness ensures Universal Consistency (the solution minimizing the empirical error converges to the best possible error as the number of samples goes to infinity). However, it is proved that the standard method consisting in choosing a maximal program size depending on the number of samples might stil...
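As a rough illustration of what a "parsimonious fitness" can look like in symbolic regression, the sketch below adds a size penalty to the empirical error. The abstract does not give the penalty used in the paper, so the form `penalty_weight * size / sqrt(n)` and the names `empirical_error`, `parsimonious_fitness` and `penalty_weight` are assumptions chosen only for illustration.

```python
import math


def empirical_error(program, samples):
    """Mean squared error of `program` over a list of (x, y) samples."""
    return sum((program(x) - y) ** 2 for x, y in samples) / len(samples)


def parsimonious_fitness(program, size, samples, penalty_weight=1.0):
    """Empirical error plus a size penalty (lower is better).

    Assumption: the penalty is penalty_weight * size / sqrt(n), which
    shrinks as the number of samples n grows. This is an illustrative
    choice, not the penalty analysed in the paper.
    """
    n = len(samples)
    return empirical_error(program, samples) + penalty_weight * size / math.sqrt(n)


# Toy data for the target y = 2x.
samples = [(float(x), 2.0 * x) for x in range(10)]

# Two candidates with identical outputs; each is (callable, node count).
small = (lambda x: 2.0 * x, 3)
bloated = (lambda x: 2.0 * x + 0.0 * x * x, 9)

# Both fit the data exactly, so the size penalty decides between them.
best = min([small, bloated],
           key=lambda c: parsimonious_fitness(c[0], c[1], samples))
```

Both candidates have zero empirical error on this toy data, so the penalized fitness selects the smaller program; with an unpenalized fitness the two would be tied.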
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005
Where GECCO
Authors Sylvain Gelly, Olivier Teytaud, Nicolas Bredeche, Marc Schoenauer