Online Active Learning Methods for Fast Label-Efficient Spam Filtering

Active learning methods seek to reduce the number of labeled examples needed to train an effective classifier, and have natural appeal in spam filtering applications where trustworthy labels for messages may be costly to acquire. Past investigations of active learning in spam filtering have focused on the pool-based scenario, where there is assumed to be a large, unlabeled data set and the goal is to iteratively identify the best subset of examples for which to request labels. However, even with optimizations this is a costly approach. We investigate an online active learning scenario where the filter is exposed to a stream of messages which must be classified one at a time. The filter may only request a label for a given message immediately after it has been classified. The goal is to achieve strong online classification performance with few label requests. This is a novel scenario for low-cost active spam filtering, fitting for application in large-scale systems. We draw from the la...
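The online scenario described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the paper's method: it assumes a bag-of-words linear classifier with a perceptron-style update and a fixed-margin query rule (request a label only when the score falls close to the decision boundary). The class name `OnlineActiveFilter`, the parameters `margin` and `lr`, and the `oracle` callback are all hypothetical names introduced for illustration.

```python
class OnlineActiveFilter:
    """Hypothetical online active learner: linear scores, fixed-margin queries."""

    def __init__(self, margin=1.0, lr=0.5):
        self.weights = {}        # token -> weight (assumed bag-of-words model)
        self.margin = margin     # assumed query threshold on |score|
        self.lr = lr             # assumed learning rate
        self.label_requests = 0  # number of labels requested so far

    @staticmethod
    def tokenize(text):
        # Unique lowercase whitespace-separated tokens.
        return set(text.lower().split())

    def classify(self, text):
        # Returns (prediction, score); +1 means spam, -1 means ham.
        score = sum(self.weights.get(t, 0.0) for t in self.tokenize(text))
        return (1 if score > 0 else -1), score

    def process(self, text, oracle):
        # Step 1: the message is classified first, as the scenario requires.
        prediction, score = self.classify(text)
        # Step 2: only afterwards may a label be requested. Here that happens
        # when the score lies inside the margin (an assumed uncertainty rule).
        if abs(score) < self.margin:
            label = oracle(text)  # costly call: returns +1 or -1
            self.label_requests += 1
            # Perceptron-style update on every queried message.
            for t in self.tokenize(text):
                self.weights[t] = self.weights.get(t, 0.0) + self.lr * label
        return prediction


def run_stream(messages, oracle, **kwargs):
    # Feed messages one at a time; return the filter and its predictions.
    filt = OnlineActiveFilter(**kwargs)
    predictions = [filt.process(m, oracle) for m in messages]
    return filt, predictions
```

A toy stream could be run as `run_stream(messages, oracle)`, where `oracle` stands in for whatever trusted labeling source is available. Comparing `label_requests` against the stream length gives a rough view of the trade-off the abstract describes: online classification quality versus the number of labels requested.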

D. Sculley

Active Learning | CEAS 2007 | Internet Technology | Online Active Learning | Spam Filtering

Post Info

Added	12 Aug 2010
Updated	12 Aug 2010
Type	Conference
Year	2007
Where	CEAS
Authors	D. Sculley
