Sciweavers

203 search results - page 17 / 41
» Conceptual-Model-Based Data Extraction from Multiple-Record ...
WWW
2005
ACM
14 years 8 months ago
METEOR: metadata and instance extraction from object referral lists on the web
The Web has established itself as the largest public data repository ever available. Even though the vast majority of information on the Web is formatted to be easily readable by ...
Hasan Davulcu, Srinivas Vadrevu, Saravanakumar Nag...
AIRWEB
2007
Springer
14 years 1 month ago
Extracting Link Spam using Biased Random Walks from Spam Seed Sets
Link spam deliberately manipulates hyperlinks between web pages in order to unduly boost the search engine ranking of one or more target pages. Link based ranking algorithms such ...
Baoning Wu, Kumar Chellapilla
ER
2007
Springer
14 years 1 month ago
Automatic Hidden-Web Table Interpretation by Sibling Page Comparison
The longstanding problem of automatic table interpretation still eludes us. Its solution would not only be an aid to table processing applications such as large volume table conve...
Cui Tao, David W. Embley
LREC
2010
13 years 9 months ago
BlogBuster: A Tool for Extracting Corpora from the Blogosphere
This paper presents BlogBuster, a tool for extracting a corpus from the blogosphere. The topic of cleaning arbitrary web pages with the goal of extracting a corpus from web data, ...
Georgios Petasis, Dimitrios Petasis
KDD
1997
ACM
13 years 11 months ago
Learning to Extract Text-Based Information from the World Wide Web
There is a wealth of information to be mined from narrative text on the World Wide Web. Unfortunately, standard natural language processing (NLP) extraction techniques expect full, grammat...
Stephen Soderland