Sciweavers

» Extracting Structured Data from Web Pages
ADBIS
1997
Springer
14 years 22 days ago
Semistructured Data: The Tsimmis Experience
In this paper we discuss the management of semi-structured data, i.e., data that has irregular or dynamically changing structure. We describe components of the Stanford Tsimmis Pr...
Joachim Hammer, Jason McHugh, Hector Garcia-Molina
VLDB
2001
ACM
14 years 1 month ago
RoadRunner: Towards Automatic Data Extraction from Large Web Sites
The paper investigates techniques for extracting data from HTML sites through the use of automatically generated wrappers. To automate the wrapper generation and the data extracti...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
ER
2001
Springer
14 years 1 month ago
On the Automatic Extraction of Data from the Hidden Web
An increasing amount of Web data is accessible only by filling out HTML forms to query an underlying data source. While this is most welcome from a user perspective (queries are e...
Stephen W. Liddle, Sai Ho Yau, David W. Embley
WWW
2010
ACM
13 years 8 months ago
Exploiting content redundancy for web information extraction
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
KDD
2007
ACM
14 years 9 months ago
Corroborate and learn facts from the web
The web contains lots of interesting factual information about entities, such as celebrities, movies or products. This paper describes a robust bootstrapping approach to corrobora...
Shubin Zhao, Jonathan Betz