Sciweavers

391 search results - page 25 / 79
» Finding and Extracting Data Records from Web Pages
WWW
2005
ACM
14 years 9 months ago
METEOR: metadata and instance extraction from object referral lists on the web
The Web has established itself as the largest public data repository ever available. Even though the vast majority of information on the Web is formatted to be easily readable by ...
Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nag...
PVLDB
2008
141 views
13 years 8 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
ER
2007
Springer
142 views, Database
14 years 2 months ago
Automatic Hidden-Web Table Interpretation by Sibling Page Comparison
The longstanding problem of automatic table interpretation still eludes us. Its solution would not only be an aid to table processing applications such as large volume table conve...
Cui Tao, David W. Embley
LREC
2010
216 views, Education
13 years 10 months ago
BlogBuster: A Tool for Extracting Corpora from the Blogosphere
This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...
Georgios Petasis, Dimitrios Petasis
COMAD
2009
13 years 9 months ago
Querying for relations from the semi-structured Web
We present a class of web queries whose result is a multi-column relation instead of a collection of unstructured documents as in standard web search. The user specifies the query...
Sunita Sarawagi