Sciweavers

295 search results - page 15 / 59
Web Crawling
CN
1998
13 years 9 months ago
Efficient Crawling Through URL Ordering
In this paper we study in what order a crawler should visit the URLs it has seen, in order to obtain more "important" pages first. Obtaining important pages rapidly can ...
Junghoo Cho, Hector Garcia-Molina, Lawrence Page
HT
2006
ACM
14 years 3 months ago
Evaluation of crawling policies for a web-repository crawler
We have developed a web-repository crawler that is used for reconstructing websites when backups are unavailable. Our crawler retrieves web resources from the Internet Archive, Go...
Frank McCown, Michael L. Nelson
CORR
2012
Springer
12 years 5 months ago
Optimal Threshold Control by the Robots of Web Search Engines with Obsolescence of Documents
A typical web search engine consists of three principal parts: crawling engine, indexing engine, and searching engine. The present work aims to optimize the performance of the cra...
Konstantin Avrachenkov, Alexander N. Dudin, Valent...
SIGIR
2002
ACM
13 years 9 months ago
Do TREC web collections look like the web?
We measure the WT10g test collection, used in the TREC-9 and TREC 2001 Web Tracks, and the .GOV test collection used in the TREC 2002 Web and Interactive Tracks, with common measu...
Ian Soboroff
ICAPR
2005
Springer
14 years 3 months ago
Combining Text and Link Analysis for Focused Crawling
The number of vertical search engines and portals has rapidly increased over the last years, making the importance of a topic-driven (focused) crawler evident. In this paper, we de...
George Almpanidis, Constantine Kotropoulos