Symbolic regression is a popular genetic programming (GP) application. Typically, the fitness function for this task is based on a sum-of-errors, involving the values of the dependent variable directly calculated from the candidate expression. While this approach is extremely successful in many instances, its performance can deteriorate in the presence of noise. In this paper, a feature-based fitness function is considered, in which the fitness scores are determined by comparing the statistical features of the sequence of values, rather than the actual values themselves. The set of features used in the fitness evaluation are customized according to the target, and are drawn from a wide set of features capable of characterizing a variety of behaviours. Experiments examining the performance of the feature-based and standard fitness functions are carried out for non-oscillating and oscillating targets in a GP system which introduces noise during the evaluation of candidate expressio...
Janine H. Imada, Brian J. Ross