IPMicra: An IP-address based Location Aware Distributed Web Crawler

Distributed crawling can overcome important limitations of traditional single-sourced web crawling systems. However, the full benefit of distributed crawling is usually limited to the sites hosting the crawlers; the rest of the URLs are by and large distributed randomly among the various crawlers. In this work, we propose a location-aware method, called IPMicra, that utilizes an IP address hierarchy and allows links to be crawled in a near-optimal, location-aware manner. Our proposal outperforms earlier distributed crawling schemes, requiring an order of magnitude less time to crawl the same set of sites.
Odysseas Papapetrou, George Samaras
Added 31 Oct 2010
Updated 31 Oct 2010
Type Conference
Year 2004
Where IC