In this paper, we address the issue of evaluating decision trees generated from training examples by a learning algorithm. We give a set of performance measures and show how some of them relate to others. We derive results suggesting that the number of leaves in a decision tree is the important measure to minimize. Minimizing this measure will, in a probabilistic sense, improve performance along the other measures. Notably it is expected to produce trees whose error rates are less likely to exceed some acceptable limit. The motivation for deriving such results is two-fold:
Usama M. Fayyad, Keki B. Irani