This paper discusses the underlying pressures responsible for code growth in genetic programming (GP), and shows how an understanding of these pressures can be used to eliminate code growth while simultaneously improving performance. We begin with a discussion of two distinct components of code growth and the extent to which each component is relevant in practice. We then define the concept of resilience in GP trees, and show that the buildup of resilience is essential for code growth. We present simple modifications to the selection procedures used by GP that eliminate bloat without hurting performance. Finally, we show that eliminating bloat can improve the performance of genetic programming by a factor that increases as the problem is scaled in difficulty.
Matthew J. Streeter