AIRWEB
2008
Springer

Web spam identification through content and hyperlinks

Download airweb.cse.lehigh.edu

We present an algorithm, WITCH, that learns to detect spam hosts or pages on the Web. Unlike most other approaches, it simultaneously exploits the structure of the Web graph as well as page contents and features. The method is efficient, scalable, and provides state-of-the-art accuracy on a standard Web spam benchmark.

Categories and Subject Descriptors: H.4.m [Information Systems Applications]: Miscellaneous; I.2.6 [Learning]; I.5 [Pattern Recognition]

Keywords: Web spam, graph regularization, Support Vector Machines
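The abstract does not state the learning objective, so the following is only a minimal sketch of the general idea behind graph-regularized classification, not the authors' formulation. It assumes a linear scorer on content features, a hinge loss on labeled hosts (the loss used by Support Vector Machines), and a squared-difference penalty that pulls the scores of hyperlinked hosts together. The function names, hyperparameters, and toy data are all hypothetical.

```python
from typing import List, Optional, Tuple


def predict(w: List[float], x: List[float]) -> float:
    """Linear score f(x) = w . x; a positive score means 'spam' in this sketch."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train(
    features: List[List[float]],
    labels: List[Optional[int]],   # +1 spam, -1 non-spam, None if unlabeled
    edges: List[Tuple[int, int]],  # hyperlinks between hosts, as index pairs
    lam: float = 0.01,             # assumed weight of the L2 penalty on w
    gamma: float = 0.1,            # assumed weight of the graph penalty
    lr: float = 0.05,
    epochs: int = 500,
) -> List[float]:
    """Subgradient descent on hinge loss + lam*||w||^2 + gamma*sum (f_i - f_j)^2."""
    dim = len(features[0])
    w = [0.0] * dim
    for _ in range(epochs):
        scores = [predict(w, x) for x in features]
        grad = [2.0 * lam * wi for wi in w]
        # Content term: hinge loss, applied to labeled hosts only.
        for x, y, f in zip(features, labels, scores):
            if y is not None and y * f < 1.0:
                for k in range(dim):
                    grad[k] -= y * x[k]
        # Graph term: penalize score differences across each hyperlink.
        for i, j in edges:
            diff = scores[i] - scores[j]
            for k in range(dim):
                grad[k] += 2.0 * gamma * diff * (features[i][k] - features[j][k])
        w = [wi - lr * gk for wi, gk in zip(w, grad)]
    return w


# Toy data (hypothetical): hosts 0 and 1 are labeled, hosts 2 and 3 are not.
features = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
labels = [1, -1, None, None]
edges = [(2, 0), (3, 1)]  # host 2 links to host 0, host 3 links to host 1

w = train(features, labels, edges)
for host, x in enumerate(features):
    print(host, "spam" if predict(w, x) > 0 else "non-spam")
```

In this sketch the unlabeled hosts receive scores through two routes: their own content features and the links that tie them to labeled neighbors. That is one simple way to combine graph structure with page contents, which is the combination the abstract describes.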

Jacob Abernethy, Olivier Chapelle, Carlos Castillo

Related Content

» Link Spam Detection based on DBSpamClust with Fuzzy C-means Clustering

» Detecting spam web pages through content analysis

» Improving web spam detection with reextracted features

» Detecting Comment Spam through Content Analysis

» Spam, Damn Spam, and Statistics: Using Statistical Analysis to Locate Spam Web Pages

» Web spam identification through language model analysis

» Extracting Link Spam using Biased Random Walks from Spam Seed Sets

» Hubs, authorities, and communities

» WebGrid A New Paradigm for Web System

Post Info

Added	12 Oct 2010
Updated	12 Oct 2010
Type	Conference
Year	2008
Where	AIRWEB
Authors	Jacob Abernethy, Olivier Chapelle, Carlos Castillo
