Sciweavers

1947 search results - page 53 / 390
» On the Automatic Extraction of Data from the Hidden Web
DL
2000
Springer
14 years 1 month ago
Snowball: extracting relations from large plain-text collections
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use...
Eugene Agichtein, Luis Gravano
ICDE
2007
IEEE
14 years 10 months ago
Annotating Structured Data of the Deep Web
An increasing number of databases have become Web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded in...
Yiyao Lu, Hai He, Hongkun Zhao, Weiyi Meng, Clemen...
COLING
2008
13 years 10 months ago
Homotopy-Based Semi-Supervised Hidden Markov Models for Sequence Labeling
This paper explores the use of the homotopy method for training a semi-supervised Hidden Markov Model (HMM) used for sequence labeling. We provide a novel polynomial-time algorith...
Gholamreza Haffari, Anoop Sarkar
LPNMR
2001
Springer
14 years 1 months ago
Declarative Information Extraction, Web Crawling, and Recursive Wrapping with Lixto
Lixto is a system and method for the visual and interactive generation of wrappers for Web pages under the supervision of a human developer, for automatically extracting informatio...
Robert Baumgartner, Sergio Flesca, Georg Gottlob
DILS
2009
Springer
14 years 3 months ago
Site-Wide Wrapper Induction for Life Science Deep Web Databases
We present a novel approach to automatic information extraction from Deep Web Life Science databases using wrapper induction. Traditional wrapper induction techniques focus on lear...
Saqib Mir, Steffen Staab, Isabel Rojas