Sciweavers

WWW
2005
ACM
14 years 9 months ago
Web data extraction based on partial tree alignment
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
Yanhong Zhai, Bing Liu
DEBU
1999
13 years 8 months ago
Data Management for XML: Research Directions
This paper is a July 1999 snapshot of a "whitepaper" that I've been working on. The purpose of the whitepaper, which I initially drafted in April 1999, was to formu...
Jennifer Widom
ICDE
2007
IEEE
14 years 10 months ago
Annotating Structured Data of the Deep Web
An increasing number of databases have become Web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded in...
Yiyao Lu, Hai He, Hongkun Zhao, Weiyi Meng, Clemen...
WWW
2001
ACM
14 years 9 months ago
IEPAD: information extraction based on pattern discovery
The research in information extraction (IE) regards the generation of wrappers that can extract particular information from semistructured Web documents. Similar to compiler gener...
Chia-Hui Chang, Shao-Chen Lui
VLDB
2011
ACM
13 years 3 months ago
Harvesting relational tables from lists on the web
A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...
Hazem Elmeleegy, Jayant Madhavan, Alon Y. Halevy