Search Sciweavers | Sciweavers

232 search results - page 8 / 47

» Query-related data extraction of hidden web documents

ER
2007
Springer

Automatic Hidden-Web Table Interpretation by Sibling Page Comparison

Download www.deg.byu.edu

The longstanding problem of automatic table interpretation still eludes us. Its solution would not only be an aid to table processing applications such as large volume table conve...

Cui Tao, David W. Embley

ACMICEC
2006
ACM

From HTML documents to web tables and rules

Download www.informatik.uni-freiburg.de

We present a browser-extending Semantic Web extraction system that maps HTML documents to tables and, where possible, to rules. First, the basic data extractor ViPER distills and ...

Kai Simon, Georg Lausen, Harold Boley

SIGMOD
2004
ACM

Understanding Web Query Interfaces: Best-Effort Parsing with Hidden Syntax

Download www-forward.cs.uiuc.edu

Recently, the Web has been rapidly "deepened" by many searchable databases online, where data are hidden behind query forms. For modelling and integrating Web databases,...

Zhen Zhang, Bin He, Kevin Chen-Chuan Chang

SYNASC
2006
IEEE

HTML Pattern Generator--Automatic Data Extraction from Web Pages

Download www.informatik.tu-cottbus.de

Existing methods of information extraction from HTML documents include manual approach, supervised learning and automatic techniques. The manual method has high precision and reca...

Mirel Cosulschi, Adrian Giurca, Bogdan Udrescu, Ni...

AND
2009

Digital weight watching: reconstruction of scanned documents

Download ilps.science.uva.nl

A web portal providing access to over 250,000 scanned and OCRed cultural heritage documents is analyzed. The collection consists of the complete Dutch Hansard from 1917 to 1995. E...

Tim Gielissen, Maarten Marx
