Sciweavers

498 search results - page 32 / 100
» Robust web content extraction
ICDAR
2003
IEEE
14 years 4 months ago
Identifying Story and Preview Images in News Web Pages
The World Wide Web provides an increasingly powerful and popular publication mechanism. Web documents often contain a large number of images serving various different purposes. Th...
Jianying Hu, Amit Bagga
AIRWEB
2008
Springer
14 years 26 days ago
A few bad votes too many?: towards robust ranking in social media
Online social media draws heavily on active reader participation, such as voting or rating of news stories, articles, or responses to a question. This user feedback is invaluable ...
Jiang Bian, Yandong Liu, Eugene Agichtein, Hongyua...
WSDM
2010
ACM
14 years 5 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
CLEF
2008
Springer
14 years 19 days ago
Overview of VideoCLEF 2008: Automatic Generation of Topic-Based Feeds for Dual Language Audio-Visual Content
The VideoCLEF track, introduced in 2008, aims to develop and evaluate tasks related to analysis of and access to multilingual multimedia content. In its first year, VideoCLEF pilo...
Martha Larson, Eamonn Newman, Gareth J. F. Jones
COLING
1992
14 years 11 hours ago
Knowledge Extraction From Texts By Sintesi
In this paper we present SINTESI, a system for the knowledge extraction from Italian inputs, currently under development in our research centre. It is used on short descriptive d...
Fabio Ciravegna, Paolo Campia, Alberto Colognese