Time-efficient spam e-mail filtering using n-gram models


Download www.cmpe.boun.edu.tr

In this paper, we propose spam e-mail filtering methods that have high accuracy and low time complexity. The methods are based on the n-gram approach and a heuristic referred to as the first n-words heuristic. We develop two models, a class general model and an e-mail specific model, and test the methods under these models. The models are then combined in such a way that the latter is activated in cases where the former falls short. Though the proposed approach and the methods developed are general and can be applied to any language, we mainly apply them to Turkish, which is an agglutinative language, and examine some properties of the language. Extensive tests were performed, and success rates of about 98% for Turkish and 99% for English were obtained. It has been shown that the time complexity can be reduced significantly without sacrificing performance.
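
To make the idea concrete, here is a minimal Python sketch of an n-gram classifier that looks only at the first n words of a message. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the n-gram order, the smoothing scheme, or the tokenization, so the word bigrams, add-one smoothing, and whitespace tokenization used here are assumptions, and the names `first_n_words`, `BigramClassModel`, and `classify` are hypothetical. The sketch covers only one model per class; the combination with an e-mail specific model is not shown.

```python
import math
from collections import Counter


def first_n_words(text, n):
    """Keep only the first n whitespace-separated tokens, lowercased."""
    return text.lower().split()[:n]


class BigramClassModel:
    """Add-one smoothed word bigram model for one class (assumed design)."""

    def __init__(self, messages, n_words):
        self.n_words = n_words
        self.bigrams = Counter()   # counts of (previous word, current word)
        self.contexts = Counter()  # counts of the previous word alone
        self.vocab = set()
        for msg in messages:
            tokens = ["<s>"] + first_n_words(msg, n_words)
            self.vocab.update(tokens)
            for prev, cur in zip(tokens, tokens[1:]):
                self.bigrams[(prev, cur)] += 1
                self.contexts[prev] += 1

    def log_prob(self, text):
        """Log-probability of the first n_words tokens of text."""
        tokens = ["<s>"] + first_n_words(text, self.n_words)
        v = len(self.vocab) + 1  # one extra slot for unseen words
        return sum(
            math.log((self.bigrams[(p, c)] + 1) / (self.contexts[p] + v))
            for p, c in zip(tokens, tokens[1:])
        )


def classify(text, spam_model, legit_model):
    """Pick the class whose model assigns the higher log-probability."""
    if spam_model.log_prob(text) > legit_model.log_prob(text):
        return "spam"
    return "legit"


# Toy training data, invented purely for illustration.
spam_model = BigramClassModel(
    ["win free money now", "free money offer click now"], n_words=3
)
legit_model = BigramClassModel(
    ["meeting at noon tomorrow", "please review the attached report"], n_words=3
)

print(classify("free money now please ignore", spam_model, legit_model))
print(classify("meeting at noon free money", spam_model, legit_model))
```

Because scoring touches at most n tokens per message, the cost of classifying an e-mail is bounded by n rather than by the message length, which is the kind of time saving the first n-words heuristic aims for.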

Ali Çiltik, Tunga Güngör

E-mail Filtering Methods | Low Time Complexities | PRL 2008 | Time Complexities |

Post Info

Added	14 Dec 2010
Updated	14 Dec 2010
Type	Journal
Year	2008
Where	PRL
Authors	Ali Çiltik, Tunga Güngör
