Sciweavers

» Testbed for information extraction from deep web
WWW
2007
ACM
14 years 8 months ago
EPCI: extracting potentially copyright infringement texts from the web
In this paper, we propose EPCI, a new system for extracting potentially copyright-infringing texts from the Web. EPCI extracts them in the following way: (1) generating a set...
Takashi Tashiro, Takanori Ueda, Taisuke Hori, Yu H...
WWW
2007
ACM
14 years 8 months ago
Adaptive record extraction from web pages
We describe an adaptive method for extracting records from web pages. Our algorithm combines a weighted tree matching metric with clustering for obtaining data extraction patterns...
Justin Park, Denilson Barbosa
WISE
2005
Springer
14 years 1 month ago
Constructing Interface Schemas for Search Interfaces of Web Databases
Many databases have become Web-accessible through form-based search interfaces (i.e., search forms) that allow users to specify complex and precise queries to access the underlying...
Hai He, Weiyi Meng, Clement T. Yu, Zonghuan Wu
PKDD
2007
Springer
14 years 2 months ago
Using the Web to Reduce Data Sparseness in Pattern-Based Information Extraction
Textual patterns have been used effectively to extract information from large text collections. However, they rely heavily on textual redundancy, in the sense that facts have to be m...
Sebastian Blohm, Philipp Cimiano
AI
2005
Springer
13 years 7 months ago
Unsupervised named-entity extraction from the Web: An experimental study
The KNOWITALL system aims to automate the tedious process of extracting large collections of facts (e.g., names of scientists or politicians) from the Web in an unsupervised, doma...
Oren Etzioni, Michael J. Cafarella, Doug Downey, A...