We propose a simple, novel, yet effective method for building and testing decision trees that minimizes the sum of the misclassification and test costs. More specifically, we first put forward an original and simple splitting criterion for attribute selection in tree building. Our tree-building algorithm has many desirable properties for a cost-sensitive learning system that must account for both types of costs. Then, assuming that the test cases may have a large number of missing values, we design several intelligent test strategies that can suggest ways of obtaining the missing values at a cost in order to minimize the total cost. We experimentally compare these strategies with C4.5, and demonstrate that our new algorithms significantly outperform C4.5 and its variations. In addition, our algorithm's complexity is similar to that of C4.5, and is much lower than that of previous work. Our work is useful for many diagnostic tasks that must factor in the misclassification and te...
Charles X. Ling, Qiang Yang, Jianning Wang, Shicha