CEAS
2006
Springer

An Adaptive, Semi-Structured Language Model Approach to Spam Filtering on a New Corpus

Download www.ceas.cc

Motivated by current efforts to construct more realistic spam filtering experimental corpora, we present a newly assembled, publicly available corpus of genuine and unsolicited (spam) email, dubbed GenSpam. We also propose an adaptive model for semi-structured document classification based on language model component interpolation. We compare this with a number of alternative classification models, and report promising results on the spam filtering task using a specifically assembled test set to be released as part of the GenSpam corpus.
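The abstract does not spell out the model, so the following is only a minimal sketch of the general idea of interpolating language model components over a semi-structured document, not the paper's actual method. It assumes each email is split into named fields (here `subject` and `body`, both hypothetical), trains one add-one smoothed unigram model per field and class, and combines the per-field scores with fixed weights. The class and function names, the smoothing choice, and the fixed weights are all assumptions made for illustration; an adaptive model would tune the weights rather than fix them.

```python
import math
from collections import Counter


class UnigramLM:
    """Add-one smoothed unigram model over lowercased whitespace tokens."""

    def __init__(self, texts, vocab_size):
        self.counts = Counter(tok for t in texts for tok in t.lower().split())
        self.total = sum(self.counts.values())
        self.vocab_size = vocab_size

    def avg_log_prob(self, text):
        """Mean per-token log-probability; 0.0 for an empty field."""
        toks = text.lower().split()
        if not toks:
            return 0.0
        denom = self.total + self.vocab_size
        return sum(math.log((self.counts[t] + 1) / denom) for t in toks) / len(toks)


def train(docs, fields):
    """docs is a list of (label, {field: text}) pairs.

    Returns {label: {field: UnigramLM}}, i.e. one component model
    per class and per document field.
    """
    vocab = {tok for _, d in docs for f in fields
             for tok in d.get(f, "").lower().split()}
    vocab_size = len(vocab) + 1  # one extra slot for unseen tokens
    labels = {label for label, _ in docs}
    return {
        label: {
            f: UnigramLM([d.get(f, "") for l, d in docs if l == label], vocab_size)
            for f in fields
        }
        for label in labels
    }


def classify(models, doc, weights):
    """Pick the class with the highest weighted sum of per-field scores.

    weights maps field name to its interpolation weight (assumed fixed here).
    """
    scores = {
        label: sum(weights[f] * lm.avg_log_prob(doc.get(f, ""))
                   for f, lm in field_lms.items())
        for label, field_lms in models.items()
    }
    return max(scores, key=scores.get)
```

Weighting the fields separately lets a short but informative field such as the subject line contribute on its own terms instead of being swamped by the body text.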

Ben Medlock

Post Info

Added	20 Aug 2010
Updated	20 Aug 2010
Type	Conference
Year	2006
Where	CEAS
Authors	Ben Medlock