Sciweavers

» Extraction of Structural Information from the Web
WEBI
2005
Springer
14 years 1 month ago
Automated Metadata and Instance Extraction from News Web Sites
In this paper, we present automated techniques for extracting metadata instance information by organizing and mining a set of news Web sites. We develop algorithms that detect and...
Srinivas Vadrevu, Saravanakumar Nagarajan, Fatih G...
WWW
2011
ACM
13 years 2 months ago
HyLiEn: a hybrid approach to general list extraction on the web
We consider the problem of automatically extracting general lists from the web. Existing approaches are mostly dependent upon either the underlying HTML markup or the visual struc...
Fabio Fumarola, Tim Weninger, Rick Barber, Donato ...
CLEF
2010
Springer
13 years 7 months ago
Person Attribute Extraction from the Textual Parts of Web Pages
We present the RGAI systems which participated in the third Web People Search Task challenge. The chief characteristics of our approach are that we focus on the raw textual parts o...
István Nagy, Richárd Farkas
WWW
2010
ACM
13 years 7 months ago
Exploiting content redundancy for web information extraction
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
KDD
2008
ACM
14 years 8 months ago
Information extraction from Wikipedia: moving down the long tail
Not only is Wikipedia a comprehensive source of quality information, it has several kinds of internal structure (e.g., relational summaries known as infoboxes), which enable self-...
Fei Wu, Raphael Hoffmann, Daniel S. Weld