Search Sciweavers | Sciweavers

543 search results - page 9 / 109

» Exploiting content redundancy for web information extraction


SIGKDD
2008


Web data mining: exploring hyperlinks, contents, and usage data

15 years 5 months ago

Download www.sigkdd.org

This paper presents a review of the book "Web Data Mining - Exploring Hyperlinks, Contents, and Usage Data" by Bing Liu. The review concludes that the breadth and depth ...

Olfa Nasraoui



KDD
2004
ACM

Data Mining

Exploiting dictionaries in named entity extraction: combining semi-Markov extraction processes and data integration methods

16 years 5 months ago

Download www.cs.cmu.edu

We consider the problem of improving named entity recognition (NER) systems by using external dictionaries--more specifically, the problem of extending state-of-the-art NER system...

William W. Cohen, Sunita Sarawagi



ICDM
2008
IEEE

Data Mining

xCrawl: A High-Recall Crawling Method for Web Mining

15 years 11 months ago

Download ls13-www.cs.uni-dortmund.de

Web Mining Systems exploit the redundancy of data published on the Web to automatically extract information from existing web documents. The first step in the Information Extract...

Kostyantyn M. Shchekotykhin, Dietmar Jannach, Gerh...



SOFSEM
2007
Springer

Theoretical Computer Science

Creating Permanent Test Collections of Web Pages for Information Extraction Research

15 years 11 months ago

Download www.dbai.tuwien.ac.at

In the research area of automatic web information extraction, there is a need for permanent and annotated web page collections enabling objective performance evaluation of differen...

Bernhard Pollak, Wolfgang Gatterbauer



LPNMR
2001
Springer

Automated Reasoning

Declarative Information Extraction, Web Crawling, and Recursive Wrapping with Lixto

15 years 9 months ago

Download lsirpeople.epfl.ch

Lixto is a system and method for the visual and interactive generation of wrappers for Web pages under the supervision of a human developer, for automatically extracting informatio...

Robert Baumgartner, Sergio Flesca, Georg Gottlob

