Quantization of Continuous Input Variables for Binary Classification



Quantization of continuous variables is important in data analysis, especially for model classes such as Bayesian networks and decision trees, which use discrete variables. Often, the discretization is based only on the distribution of the input variables, whereas additional information, for example in the form of class membership, is frequently available and could be used to improve the quality of the results. In this paper, quantization methods based on equal-width intervals, maximum entropy, maximum mutual information, and a novel approach combining maximum mutual information with entropy are considered. The first two approaches do not take class membership into account, whereas the last two do. The relative merits of each method are compared in an empirical setting, where results are shown for two data sets in a direct marketing problem, and the quality of quantization is measured by mutual information and the performance of Naive Bayes and C5 decision tree cl...
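To make the two class-independent methods and the mutual information measure concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes that maximum entropy quantization can be approximated by equal-frequency (quantile) bins, which maximize the entropy of the discretized variable, and it uses synthetic data. All function names, the data generator, and the choice of k = 4 bins are hypothetical.

```python
import numpy as np

def equal_width_edges(x, k):
    """Interior cut points splitting [min(x), max(x)] into k equal-width bins."""
    return np.linspace(x.min(), x.max(), k + 1)[1:-1]

def equal_frequency_edges(x, k):
    """Interior cut points at empirical quantiles (assumed stand-in for
    maximum entropy quantization: bins are nearly equally populated)."""
    return np.quantile(x, np.arange(1, k) / k)

def entropy(bins, k):
    """Empirical entropy in bits of the discretized variable."""
    p = np.bincount(bins, minlength=k) / bins.size
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def mutual_information(bins, y, k):
    """Empirical mutual information I(bin; class) in bits, y in {0, 1}."""
    joint = np.zeros((k, 2))
    np.add.at(joint, (bins, y), 1)          # contingency table of counts
    joint /= joint.sum()                    # joint probabilities
    px = joint.sum(axis=1, keepdims=True)   # marginal over bins
    py = joint.sum(axis=0, keepdims=True)   # marginal over classes
    nz = joint > 0
    return float((joint[nz] * np.log2(joint[nz] / (px @ py)[nz])).sum())

# Synthetic, skewed input whose scale depends on the binary class (assumption).
rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)
x = rng.exponential(scale=1.0 + y)

k = 4
bins_ew = np.digitize(x, equal_width_edges(x, k))      # bin indices 0..k-1
bins_ef = np.digitize(x, equal_frequency_edges(x, k))  # bin indices 0..k-1

h_ew, h_ef = entropy(bins_ew, k), entropy(bins_ef, k)
mi_ew = mutual_information(bins_ew, y, k)
mi_ef = mutual_information(bins_ef, y, k)
print(f"equal width:     H = {h_ew:.3f} bits, I = {mi_ew:.4f} bits")
print(f"equal frequency: H = {h_ef:.3f} bits, I = {mi_ef:.4f} bits")
```

On skewed data, equal-width bins tend to leave most points in one or two bins, so their entropy falls below the log2(k) attained by equal-frequency bins. A class-aware method would instead search for cut points that maximize the mutual information value computed here.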

Michal Skubacz, Jaakko Hollmén


Decision Tree | IDEAL 2000 | Intelligent Agents | Maximum Mutual Information | Mutual Information |


Post Info

Added	25 Aug 2010
Updated	25 Aug 2010
Type	Conference
Year	2000
Where	IDEAL
Authors	Michal Skubacz, Jaakko Hollmén
