Search Sciweavers | Sciweavers

72 search results - page 2 / 15

» Ontology-Focused Crawling of Web Documents

VLDB
2000
ACM

125 views, Database

Focused Crawling Using Context Graphs

13 years 10 months ago

Download clgiles.ist.psu.edu

Maintaining currency of search engine indices by exhaustive crawling is rapidly becoming impossible due to the increasing size and dynamic content of the web. Focused crawlers aim...

Michelangelo Diligenti, Frans Coetzee, Steve Lawre...

ICDM
2008
IEEE

186 views, Data Mining

xCrawl: A High-Recall Crawling Method for Web Mining

14 years 1 month ago

Download ls13-www.cs.uni-dortmund.de

Web Mining Systems exploit the redundancy of data published on the Web to automatically extract information from existing web documents. The first step in the Information Extract...

Kostyantyn M. Shchekotykhin, Dietmar Jannach, Gerh...

WWW
2003
ACM

183 views, Internet Technology

Distributed Indexing of the Web Using Migrating Crawlers

14 years 7 months ago

Download softsys.cs.uoi.gr

Due to the tremendous increase rate and the high change frequency of Web documents, maintaining an up-to-date index for searching purposes (search engines) is becoming a challenge...

Odysseas Papapetrou, Stavros Papastavrou, George S...

ADMA
2009
Springer

142 views, Data Mining

Crawling Deep Web Using a New Set Covering Algorithm

14 years 1 month ago

Download cs.uwindsor.ca

Crawling the deep web often requires the selection of an appropriate set of queries so that they can cover most of the documents in the data source with low cost. This ca...

Yan Wang, Jianguo Lu, Jessica Chen

WWW
2007
ACM

162 views, Internet Technology

Detecting near-duplicates for web crawling

14 years 7 months ago

Download infolab.stanford.edu

Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...

Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
