Tackling the Poor Assumptions of Naive Bayes Text Classifiers

Naive Bayes is often used as a baseline in text classification because it is fast and easy to implement. Its severe assumptions make such efficiency possible but also adversely affect the quality of its results. In this paper we propose simple, heuristic solutions to some of the problems with Naive Bayes classifiers, addressing both systemic issues as well as problems that arise because text is not actually generated according to a multinomial model. We find that our simple corrections result in a fast algorithm that is competitive with state-of-the-art text classification algorithms such as the Support Vector Machine.

Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger

ICML 2003 | Machine Learning | Naive Bayes Classifiers | Support Vector Machine | Text Classification Algorithms

» Learning Bayesian Network Classifiers for Facial Expression Recognition using both Labeled...

» Learning to Classify Text from Labeled and Unlabeled Documents

» Recognizing EndUser Transactions in Performance Management

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2003
Where	ICML
Authors	Jason D. Rennie, Lawrence Shih, Jaime Teevan, David R. Karger
