
KDD
2001
ACM

Data mining criteria for tree-based regression and classification

This paper is concerned with constructing regression and classification trees that are better adapted to data mining applications than conventional trees. To this end, we propose new splitting criteria for growing trees. Conventional splitting criteria try to perform well on both sides of a split by compromising on the quality of fit between the left and the right side. By contrast, we adopt a data mining point of view and propose criteria that search for interesting subsets of the data, as opposed to modeling all of the data equally well. The new criteria do not split based on a compromise between the left and the right bucket; they effectively pick the more interesting bucket and ignore the other. As expected, the result is often a simpler characterization of interesting subsets of the data. Less expected is that the new criteria often yield whole trees that provide more interpretable data descriptions. Surprisingly, it is a "flaw" that works to their ...
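
The contrast between a compromise criterion and a one-sided criterion can be illustrated with a minimal Python sketch. It assumes a regression setting with a single numeric feature and uses bucket variance as the measure of fit. The function names `split_scores` and `best_split`, the `min_bucket` parameter, and the choice of "smaller of the two bucket variances" as the one-sided score are illustrative assumptions, not necessarily the exact criteria defined in the paper.

```python
import numpy as np

def split_scores(x, y, min_bucket=2):
    """Score every candidate threshold on one feature under two criteria.

    Returns a list of (threshold, compromise_score, one_sided_score).
    Both scores are "lower is better". Hypothetical helper for illustration.
    """
    order = np.argsort(x, kind="stable")
    x = np.asarray(x, dtype=float)[order]
    y = np.asarray(y, dtype=float)[order]
    results = []
    for i in range(min_bucket, len(y) - min_bucket + 1):
        if x[i - 1] == x[i]:
            continue  # no threshold can separate equal feature values
        left, right = y[:i], y[i:]
        # Conventional: size-weighted average of both bucket variances,
        # i.e. a compromise between the left and the right side.
        compromise = (len(left) * left.var() + len(right) * right.var()) / len(y)
        # One-sided (assumed form): score only the purer bucket, ignore the other.
        one_sided = min(left.var(), right.var())
        threshold = (x[i - 1] + x[i]) / 2
        results.append((threshold, compromise, one_sided))
    return results

def best_split(x, y, criterion="compromise", min_bucket=2):
    """Return the threshold that minimizes the chosen criterion."""
    col = 1 if criterion == "compromise" else 2
    return min(split_scores(x, y, min_bucket), key=lambda r: r[col])[0]

# Toy data: a perfectly homogeneous group (x <= 2) next to noisier responses
# that shift upward for x >= 7.
x = list(range(10))
y = [5, 5, 5, 0, 10, 0, 10, 20, 30, 20]

print(best_split(x, y, "compromise", min_bucket=3))  # 6.5
print(best_split(x, y, "one_sided", min_bucket=3))   # 2.5
```

On this toy data the compromise criterion splits at the overall mean shift, while the one-sided score isolates the small homogeneous bucket and disregards how poorly the remaining data are fit.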
Andreas Buja, Yung-Seop Lee
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2001
Where KDD
Authors Andreas Buja, Yung-Seop Lee