Sciweavers

96 search results - page 4 / 20
» Detecting Near-replicas on the Web by Content and Hyperlink ...
WWW
2009
ACM
14 years 8 months ago
A densitometric analysis of web template content
What makes template content in the Web so special that we need to remove it? In this paper I present a large-scale aggregate analysis of textual Web content, corroborating statist...
Christian Kohlschütter
WAIM
2010
Springer
13 years 6 months ago
Detecting Comment Spam through Content Analysis
In the Web 2.0 era, individual Internet users can also act as information providers, releasing information or making comments conveniently. However, some participants may spre...
Congrui Huang, Qiancheng Jiang, Yan Zhang
ERCIMDL
2000
Springer
13 years 11 months ago
Automatic Web Rating: Filtering Obscene Content on the Web
We present a method to detect automatically pornographic content on the Web. Our method combines techniques from language engineering and image analysis within a machine-learning f...
Konstantinos Chandrinos, Ion Androutsopoulos, Geor...
AMR
2007
Springer
14 years 1 month ago
Automatically Detecting Members and Instrumentation of Music Bands Via Web Content Mining
Abstract. In this paper, we present an approach to automatically detecting music band members and instrumentation using web content mining techniques. To this end, we combine a nam...
Markus Schedl, Gerhard Widmer
WWW
2009
ACM
14 years 8 months ago
Towards language-independent web genre detection
The term web genre denotes the type of a given web resource, in contrast to the topic of its content. In this research, we focus on recognizing the web genres blog, wiki and forum...
Philipp Scholl, Renato Domínguez Garcí...