Non-Disjoint Discretization for Naive-Bayes Classifiers

Previous discretization techniques have discretized numeric attributes into disjoint intervals. We argue that this is neither necessary nor appropriate for naive-Bayes classifiers. Our analysis leads to a new discretization method, Non-Disjoint Discretization (NDD). NDD forms overlapping intervals for a numeric attribute, always locating a value toward the middle of an interval to obtain more reliable probability estimation. It also adjusts the number and size of discretized intervals to the number of training instances, seeking an appropriate trade-off between the bias and variance of probability estimation. We justify NDD in theory and test it on a wide cross-section of datasets. Our experimental results suggest that, for naive-Bayes classifiers, NDD works better than alternative discretization approaches.
Ying Yang, Geoffrey I. Webb
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2002
Where ICML
Authors Ying Yang, Geoffrey I. Webb
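The abstract describes NDD only at a high level. As a rough illustration, the Python sketch below builds overlapping intervals for one numeric attribute in the spirit described above: equal-frequency atomic intervals are grouped into overlapping composite intervals, the number of intervals grows with the number of training instances, and each value is assigned to the composite interval whose middle it falls in. The function names, the grouping of three atomic intervals per composite interval, and the sqrt(n)-based sizing are assumptions made for this illustration, not necessarily the authors' exact procedure.

import math

import numpy as np


def ndd_intervals(train_values, n_atomic=None):
    """Build overlapping (non-disjoint) intervals for one numeric attribute."""
    values = np.sort(np.asarray(train_values, dtype=float))
    n = len(values)
    if n_atomic is None:
        # Assumed sizing rule: more training data -> more (and narrower)
        # intervals, trading off bias against variance of the estimates.
        n_atomic = max(3, int(round(3 * math.sqrt(n))))
    # Equal-frequency atomic cut points (n_atomic + 1 boundaries).
    cuts = np.quantile(values, np.linspace(0.0, 1.0, n_atomic + 1))
    # Composite interval k spans atomic intervals k-1, k, k+1 (clamped at the
    # ends), so consecutive composite intervals overlap.
    composites = [
        (cuts[max(k - 1, 0)], cuts[min(k + 2, n_atomic)])
        for k in range(n_atomic)
    ]
    return cuts, composites


def assign_interval(x, cuts):
    """Index of the composite interval whose middle atomic interval holds x,
    so x sits toward the middle of the interval used for it."""
    k = int(np.searchsorted(cuts[1:-1], x, side="right"))
    return min(k, len(cuts) - 2)

One way to use these intervals in a naive-Bayes classifier: at training time, estimate P(interval | class) for each composite interval from the training instances of that class whose attribute values fall anywhere inside it (overlapping intervals therefore share training data); at classification time, substitute the probability of the interval returned by assign_interval for the numeric value when computing the naive-Bayes product.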