In this paper we address the problem of learning a structural prototype that can represent the variations present in a set of trees. The prototype serves as a pattern space representation for the set of trees. To do this we construct a super-tree that spans the union of the set of trees. This is a chicken-and-egg problem, since before the structure can be estimated, correspondences between the nodes of the super-tree and the nodes of the sample trees must already be available. We demonstrate how to simultaneously estimate the structure of the super-tree and recover the required correspondences by minimizing the sum of the tree edit-distances over pairs of trees, subject to edge consistency constraints. Each node of the super-tree corresponds to a dimension of the pattern space, and for each tree we construct a pattern vector whose elements are the weights corresponding to each of the dimensions of the super-tree. We perform pattern analysis on the set of trees by performing prin...
Andrea Torsello, Edwin R. Hancock
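
The embedding step described in the abstract can be illustrated with a minimal sketch. It assumes the node correspondences to the super-tree have already been recovered, and that a super-tree node with no counterpart in a given tree receives weight zero (an assumption; the abstract does not state how absent nodes are handled). The function name `pattern_vector`, the node labels, and the weights are all hypothetical and chosen only for illustration.

```python
def pattern_vector(node_weights, correspondence, n_super_nodes):
    """Embed one tree into the pattern space spanned by the super-tree.

    node_weights:   dict mapping a tree node to its weight
    correspondence: dict mapping a tree node to a super-tree node index
    n_super_nodes:  number of super-tree nodes (pattern space dimensions)
    """
    # Assumption: dimensions with no matching tree node are set to zero.
    vec = [0.0] * n_super_nodes
    for node, weight in node_weights.items():
        vec[correspondence[node]] = weight
    return vec


# Hypothetical data: a super-tree with 4 nodes and two sample trees.
tree_a = {"a0": 1.0, "a1": 0.5, "a2": 0.25}
map_a = {"a0": 0, "a1": 1, "a2": 3}

tree_b = {"b0": 1.0, "b1": 0.75}
map_b = {"b0": 0, "b1": 2}

vectors = [
    pattern_vector(tree_a, map_a, 4),
    pattern_vector(tree_b, map_b, 4),
]
```

Each tree becomes a fixed-length vector regardless of its own size, so standard vector-space pattern analysis can then be applied to the collection `vectors`.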