Sciweavers

232 search results - page 5 / 47
» Query-related data extraction of hidden web documents
FCT
2001
Springer
14 years 3 days ago
Polynomial Time Algorithms for Finding Unordered Tree Patterns with Internal Variables
Many documents such as Web documents or XML files have tree structures. A term tree is an unordered tree pattern consisting of internal variables and tree structures. In order to ...
Takayoshi Shoudai, Tomoyuki Uchida, Tetsuhiro Miya...
SIGMOD
2008
ACM
14 years 7 months ago
Web-scale extraction of structured data
A long-standing goal of Web research has been to construct a unified Web knowledge base. Information extraction techniques have shown good results on Web inputs, but even most dom...
Michael J. Cafarella, Jayant Madhavan, Alon Y. Hal...
BNCOD
2006
13 years 9 months ago
The Lixto Project: Exploring New Frontiers of Web Data Extraction
The Lixto project is an ongoing research effort in the area of Web data extraction. Whereas the project originally started out with the idea to develop a logic-based extraction lan...
Julien Carme, Michal Ceresna, Oliver Frölich,...
WWW
2009
ACM
14 years 8 months ago
Sitemaps: above and beyond the crawl of duty
Comprehensive coverage of the public web is crucial to web search engines. Search engines use crawlers to retrieve pages and then discover new ones by extracting the pages' o...
Uri Schonfeld, Narayanan Shivakumar
CIKM
1998
Springer
13 years 12 months ago
Ontology-Based Extraction and Structuring of Information from Data-Rich Unstructured Documents
We present a new approach to extracting information from unstructured documents based on an application ontology that describes a domain of interest. Starting with such an ontolog...
David W. Embley, Douglas M. Campbell, Randy D. Smi...