Filtering Email Spam in the Presence of Noisy User Feedback

Recent email spam filtering evaluations, such as those conducted at TREC, have shown that near-perfect filtering results are attained with a variety of machine learning methods when filters are given perfectly accurate labeling feedback for training. Yet in real-world settings, labeling feedback may be far from perfect. Real users give feedback that is often mistaken, inconsistent, or even maliciously inaccurate. To our knowledge, the impact of this noisy labeling feedback on current spam filtering methods has not been previously explored in the literature. In this paper, we show that noisy feedback may harm or even break state-of-the-art spam filters, including recent TREC winners. We then propose and evaluate several approaches to make such filters robust to label noise. We find that although such modifications are effective for uniform random label noise, more realistic "natural" label noise from human users remains a difficult challenge.
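
As an illustration of what "uniform random label noise" can mean in practice, the sketch below flips each training label independently with a fixed probability. This is only an assumed, minimal model and is not taken from the paper itself: the function name `flip_labels` and the parameters `noise_rate` and `seed` are hypothetical, and labels are represented as booleans (True for spam) purely for convenience.

```python
import random


def flip_labels(labels, noise_rate, seed=0):
    """Return a copy of `labels` with uniform random label noise applied.

    Each boolean label (True = spam, False = ham) is flipped independently
    with probability `noise_rate`. All names here are illustrative
    assumptions, not the paper's own procedure.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be between 0 and 1")
    rng = random.Random(seed)  # private generator keeps runs reproducible
    return [(not y) if rng.random() < noise_rate else y for y in labels]


if __name__ == "__main__":
    clean = [True, False] * 5000
    noisy = flip_labels(clean, noise_rate=0.25)
    flipped = sum(a != b for a, b in zip(clean, noisy))
    print(f"flipped {flipped} of {len(clean)} labels")
```

Because every label has the same chance of being flipped regardless of message content, this kind of noise is easy to simulate; the "natural" noise from human users described in the abstract would not follow such a simple rule.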

D. Sculley, Gordon V. Cormack

CEAS 2008 | Internet Technology | Label Noise | Labeling Feedback | Spam Filtering

Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2008
Where	CEAS
Authors	D. Sculley, Gordon V. Cormack
