Sciweavers

» Random web crawls
COOPIS
2004
IEEE
13 years 11 months ago
Minimizing the Network Distance in Distributed Web Crawling
Abstract. Distributed crawling has shown that it can overcome important limitations of the centralized crawling paradigm. However, the distributed nature of current distributed cra...
Odysseas Papapetrou, George Samaras
WWW
2006
ACM
14 years 8 months ago
Geographically focused collaborative crawling
A collaborative crawler is a group of crawling nodes, in which each crawling node is responsible for a specific portion of the web. We study the problem of collecting geographical...
Weizheng Gao, Hyun Chul Lee, Yingbo Miao
DMKD
2004
ACM
13 years 11 months ago
Discovery of ads web hosts through traffic data analysis
One of the most pressing current problems in web crawling...
V. Bacarella, Fosca Giannotti, Mirco Nanni, Dino P...
WWW
2003
ACM
14 years 8 months ago
Distributed Indexing of the Web Using Migrating Crawlers
Due to the tremendous increase rate and the high change frequency of Web documents, maintaining an up-to-date index for searching purposes (search engines) is becoming a challenge...
Odysseas Papapetrou, Stavros Papastavrou, George S...
SIGIR
2009
ACM
14 years 2 months ago
The impact of crawl policy on web search effectiveness
Crawl selection policy has a direct influence on Web search effectiveness, because a useful page that is not selected for crawling will also be absent from search results. Yet th...
Dennis Fetterly, Nick Craswell, Vishwa Vinay