Abstract. This paper explores distance measures based on genetic operators for tree-based genetic programming. Consistency between genetic operators and distance measures is crucial for analytical measures of problem difficulty, such as fitness distance correlation, and for measures of population diversity, such as entropy or variance. The contribution of this paper is an exploration of possible definitions and approximations of operator-based edit distance measures, with a particular focus on the subtree crossover operator. An empirical study illustrates the features of an operator-based distance. The paper thereby makes progress toward improved algorithmic analysis through appropriate measures of distance and similarity.
Steven M. Gustafson, Leonardo Vanneschi