Spam detection using web page content: a new battleground


Download homepages.dcc.ufmg.br

Traditional content-based e-mail spam filtering takes into account the content of e-mail messages and applies machine learning techniques to infer patterns that discriminate spams from hams. In particular, the use of content-based spam filtering unleashed an unending arms race between spammers and filter developers, given the spammers’ ability to continuously change spam message content in ways that might circumvent the current filters. In this paper, we propose to expand the horizons of content-based filters by taking into consideration the content of the Web pages linked by e-mail messages. We describe a methodology for extracting pages linked by URLs in spam messages and we characterize the relationship between those pages and the messages. We then use a machine learning technique (a lazy associative classifier) to extract classification rules from the web pages that are relevant to spam detection. We demonstrate that the use of information from linked pages can nicely complemen...
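The abstract mentions extracting the pages linked by URLs in spam messages. As a rough illustration of the first step only (finding the URLs in a message), here is a minimal Python sketch using the standard library. It is not the authors' pipeline: the function name `extract_urls`, the regular expression, and the trailing-punctuation cleanup are all assumptions made for this example, and fetching the pages and classifying them are left out.

```python
import re
from email import message_from_string
from typing import List

# Assumed pattern (not from the paper): http/https URLs up to whitespace
# or a common closing delimiter.
URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+", re.IGNORECASE)


def extract_urls(raw_message: str) -> List[str]:
    """Return distinct URLs from the text parts of a message, in first-seen order."""
    msg = message_from_string(raw_message)
    urls: List[str] = []
    seen = set()
    for part in msg.walk():
        # Only look at text/* parts (plain text or HTML bodies).
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back to UTF-8 with replacement.
            text = payload.decode("utf-8", errors="replace")
        for url in URL_PATTERN.findall(text):
            # Drop sentence punctuation that the pattern may have swallowed.
            url = url.rstrip(".,;:!?")
            if url and url not in seen:
                seen.add(url)
                urls.append(url)
    return urls


sample = (
    "Subject: Special offer\n"
    "\n"
    "Visit http://example.com/offer now, or https://example.org/a?b=1.\n"
    "Again: http://example.com/offer\n"
)
print(extract_urls(sample))
```

Each extracted URL could then be fetched and its page content passed to a classifier. How pages are retrieved and which features are used are described in the paper itself and are not reproduced here.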

Marco Túlio Ribeiro, Pedro Henrique Calais


CEAS 2011 | Internet Technology | Mail Messages | Mail Spam | Spam Messages


Post Info

Added	13 Dec 2011
Updated	13 Dec 2011
Type	Journal
Year	2011
Where	CEAS
Authors	Marco Túlio Ribeiro, Pedro Henrique Calais Guerra, Leonardo Vilela, Adriano Veloso, Dorgival Guedes, Wagner Meira Jr., Marcelo H. P. C. Chaves, Klaus Steding-Jessen, Cristine Hoepers
