Sciweavers

152 search results - page 10 / 31
» Redundancy-Driven Web Data Extraction and Integration
RIAO
1997
13 years 8 months ago
Coupling information retrieval and information extraction: A new text technology for gathering information from the web
The techniques of information retrieval and information extraction are complementary, but to date there has been little concrete work aimed at integrating the two. We describe how...
Robert J. Gaizauskas, Alexander M. Robertson
JAIR
2008
13 years 7 months ago
Creating Relational Data from Unstructured and Ungrammatical Data Sources
In order for agents to act on behalf of users, they will have to retrieve and integrate vast amounts of textual data on the World Wide Web. However, much of the useful data on the...
Matthew Michelson, Craig A. Knoblock
ICDE
2008
IEEE
14 years 8 months ago
Automatically Extracting Form Labels
We describe a machine-learning-based approach for extracting attribute labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to ...
Hoa Nguyen, Eun Yong Kang, Juliana Freire
ADC
2006
Springer
14 years 1 month ago
A two-phase rule generation and optimization approach for wrapper generation
Web information extraction is a fundamental issue for web information management and integration. A common approach is to use wrappers to extract data from web pages or documents...
Yanan Hao, Yanchun Zhang
W3C
1998
13 years 8 months ago
A Query Language for XML
An important application of XML is the interchange of electronic data (EDI) between multiple data sources on the Web. As XML data proliferates on the Web, applications will need t...
Mary F. Fernandez, Dan Suciu