CEAS
2005
Springer

Naive Bayes Spam Filtering Using Word-Position-Based Attributes

This paper explores the use of the naive Bayes classifier as the basis for personalised spam filters. Several machine learning algorithms, including variants of naive Bayes, have previously been used for this purpose, but the author’s implementation using word-position-based attribute vectors gave very good results when tested on several publicly available corpora. The effects of various forms of attribute selection (removal of frequent words, removal of infrequent words, and selection by mutual information) are investigated. It is also shown how n-grams, with n > 1, may be used to boost classification performance. Finally, an efficient weighting scheme for cost-sensitive classification is introduced.
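The abstract does not give implementation details, so the following is only a minimal sketch of the general idea, not the author's implementation. It assumes one common reading of word-position-based attributes: each position in a message is an attribute whose value is the token found there, and the token distribution is taken to be the same at every position, so a message is scored by multiplying one smoothed probability per token occurrence. The class name `PositionalNaiveBayes`, the parameters `n_max`, `alpha`, and `cost_ratio`, and the simple log-odds threshold are all hypothetical; in particular, the threshold is a generic stand-in for cost-sensitive classification and is not the weighting scheme introduced in the paper.

```python
import math
from collections import Counter


def ngrams(tokens, n_max):
    """Return all n-grams of length 1..n_max as tuples."""
    return [
        tuple(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


class PositionalNaiveBayes:
    """Toy two-class naive Bayes over token positions (hypothetical sketch)."""

    def __init__(self, n_max=1, alpha=1.0):
        self.n_max = n_max          # longest n-gram used as an attribute value
        self.alpha = alpha          # Laplace smoothing constant (assumed)
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.docs = Counter()       # number of training messages per class
        self.vocab = set()

    def fit(self, messages):
        """messages: iterable of (token_list, label), label in {'spam', 'ham'}."""
        for tokens, label in messages:
            self.docs[label] += 1
            self.counts[label].update(ngrams(tokens, self.n_max))
        self.vocab = set(self.counts["spam"]) | set(self.counts["ham"])
        return self

    def _log_score(self, tokens, label):
        counts = self.counts[label]
        total = sum(counts.values()) + self.alpha * len(self.vocab)
        # Class prior estimated from message counts.
        score = math.log(self.docs[label] / sum(self.docs.values()))
        for gram in ngrams(tokens, self.n_max):
            if gram in self.vocab:  # n-grams never seen in training are skipped
                score += math.log((counts[gram] + self.alpha) / total)
        return score

    def log_odds(self, tokens):
        """log P(spam | message) - log P(ham | message), up to a shared constant."""
        return self._log_score(tokens, "spam") - self._log_score(tokens, "ham")

    def is_spam(self, tokens, cost_ratio=1.0):
        # cost_ratio > 1 demands stronger evidence before labelling a message
        # as spam, reflecting that blocking legitimate mail is the costlier error.
        return self.log_odds(tokens) > math.log(cost_ratio)
```

Setting `n_max=2` adds bigrams alongside single words, which is one simple way to let n-grams with n > 1 contribute evidence. Raising `cost_ratio` trades missed spam for fewer blocked legitimate messages.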
Johan Hovold
Added 26 Jun 2010
Updated 26 Jun 2010
Type Conference
Year 2005
Where CEAS
Authors Johan Hovold