Semi-supervised spam filtering: does it work?

The results of the 2006 ECML/PKDD Discovery Challenge suggest that semi-supervised learning methods work well for spam filtering when the source of available labeled examples differs from those to be classified. We have attempted to reproduce these results using data from the 2005 and 2007 TREC Spam Track, and have found the opposite effect: methods like self-training and transductive support vector machines yield inferior classifiers to those constructed using supervised learning on the labeled data alone. We investigate differences between the ECML/PKDD and TREC data sets and methodologies that may account for the opposite results.

Categories and Subject Descriptors: H.3.3 Information Search and Retrieval: Information filtering

General Terms: Experimentation, measurement.
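
For readers unfamiliar with self-training, the following is a minimal sketch of the general idea, not the authors' implementation. It assumes a toy multinomial Naive Bayes classifier as a stand-in for the classifiers used in the paper, whitespace tokenization, at least one labeled example of each class, and an arbitrary confidence threshold. All function names and the sample messages are hypothetical.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Fit a toy multinomial Naive Bayes model (assumed stand-in classifier)."""
    counts = {"spam": Counter(), "ham": Counter()}
    priors = Counter(labels)  # assumes both classes occur at least once
    for doc, label in zip(docs, labels):
        counts[label].update(doc.split())
    vocab = set(counts["spam"]) | set(counts["ham"])
    return counts, priors, vocab


def spam_log_odds(model, doc):
    """Return log P(spam | doc) - log P(ham | doc); unknown tokens are ignored."""
    counts, priors, vocab = model
    score = math.log(priors["spam"] / priors["ham"])
    for cls, sign in (("spam", 1.0), ("ham", -1.0)):
        total = sum(counts[cls].values()) + len(vocab)  # add-one smoothing
        for tok in doc.split():
            if tok in vocab:
                score += sign * math.log((counts[cls][tok] + 1) / total)
    return score


def self_train(labeled_docs, labels, unlabeled_docs, threshold=1.0, rounds=5):
    """Repeatedly pseudo-label confident unlabeled messages and retrain."""
    docs, labs = list(labeled_docs), list(labels)
    pool = list(unlabeled_docs)
    model = train_nb(docs, labs)
    for _ in range(rounds):
        remaining = []
        for doc in pool:
            score = spam_log_odds(model, doc)
            if abs(score) >= threshold:
                # Confident prediction: treat it as if it were a true label.
                docs.append(doc)
                labs.append("spam" if score > 0 else "ham")
            else:
                remaining.append(doc)
        if len(remaining) == len(pool):
            break  # nothing new was pseudo-labeled, so stop early
        pool = remaining
        model = train_nb(docs, labs)
    return model


# Hypothetical toy data: two labeled and two unlabeled messages.
labeled = ["cheap pills offer", "meeting agenda notes"]
labels = ["spam", "ham"]
unlabeled = ["cheap offer winner", "agenda for project meeting"]

supervised = train_nb(labeled, labels)
semi_supervised = self_train(labeled, labels, unlabeled)
print(spam_log_odds(supervised, "winner"), spam_log_odds(semi_supervised, "winner"))
```

The sketch also shows why the approach can go wrong: a pseudo-label is only as good as the classifier that produced it, so a confident mistake is fed back into training as if it were a true label.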

Mona Mojdeh, Gordon V. Cormack

Available Labeled Examples | Information Technology | SIGIR 2008 | Transductive Support Vector | TREC Spam Track |

Added	15 Dec 2010
Updated	15 Dec 2010
Type	Journal
Year	2008
Where	SIGIR
Authors	Mona Mojdeh, Gordon V. Cormack
