SIGIR
2008
ACM

Semi-supervised spam filtering: does it work?

The results of the 2006 ECML/PKDD Discovery Challenge suggest that semi-supervised learning methods work well for spam filtering when the source of available labeled examples differs from those to be classified. We have attempted to reproduce these results using data from the 2005 and 2007 TREC Spam Track, and have found the opposite effect: methods like self-training and transductive support vector machines yield inferior classifiers to those constructed using supervised learning on the labeled data alone. We investigate differences between the ECML/PKDD and TREC data sets and methodologies that may account for the opposite results.
Categories and Subject Descriptors: H.3.3 Information Search and Retrieval: Information filtering
General Terms: Experimentation, measurement.
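For readers unfamiliar with self-training, the general idea can be sketched as follows. This is only an illustration of the mechanism, not the authors' implementation: the naive Bayes base classifier, the confidence threshold, and all function names (`train_nb`, `spam_probability`, `self_train`) are assumptions made for this example. The loop trains on the labeled messages, pseudo-labels the unlabeled messages it is confident about, and retrains on the enlarged set.

```python
import math
from collections import Counter


def train_nb(docs, labels):
    """Collect per-class token counts (label 1 = spam, 0 = ham)."""
    word_counts = {0: Counter(), 1: Counter()}
    class_counts = Counter(labels)
    for tokens, y in zip(docs, labels):
        word_counts[y].update(tokens)
    vocab = set(word_counts[0]) | set(word_counts[1])
    return word_counts, class_counts, vocab


def spam_probability(model, tokens):
    """Posterior P(spam | tokens) under naive Bayes with add-one smoothing."""
    word_counts, class_counts, vocab = model
    total = sum(class_counts.values())
    log_post = {}
    for y in (0, 1):
        denom = sum(word_counts[y].values()) + len(vocab)
        lp = math.log((class_counts[y] + 1) / (total + 2))
        for t in tokens:
            if t in vocab:  # tokens never seen in training are ignored
                lp += math.log((word_counts[y][t] + 1) / denom)
        log_post[y] = lp
    m = max(log_post.values())
    e_ham = math.exp(log_post[0] - m)
    e_spam = math.exp(log_post[1] - m)
    return e_spam / (e_ham + e_spam)


def self_train(labeled_docs, labels, unlabeled_docs, threshold=0.9, max_rounds=5):
    """Hypothetical self-training loop: add confident pseudo-labels, retrain."""
    docs, ys = list(labeled_docs), list(labels)
    pool = list(unlabeled_docs)
    model = train_nb(docs, ys)
    for _ in range(max_rounds):
        confident, remaining = [], []
        for tokens in pool:
            p = spam_probability(model, tokens)
            if p >= threshold:
                confident.append((tokens, 1))
            elif p <= 1 - threshold:
                confident.append((tokens, 0))
            else:
                remaining.append(tokens)
        if not confident:  # nothing new to learn from; stop early
            break
        for tokens, y in confident:
            docs.append(tokens)
            ys.append(y)
        pool = remaining
        model = train_nb(docs, ys)
    return model
```

Because pseudo-labels come from the classifier's own predictions, early mistakes can be reinforced in later rounds, which is one way such a loop can end up worse than the purely supervised baseline.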
Mona Mojdeh, Gordon V. Cormack
Added 15 Dec 2010
Updated 15 Dec 2010
Type Journal
Year 2008
Where SIGIR
Authors Mona Mojdeh, Gordon V. Cormack