VLDB
2004
ACM

Accurate and Efficient Crawling for Relevant Websites

Focused web crawlers have recently emerged as an alternative to the well-established web search engines. While the well-known focused crawlers retrieve relevant webpages, there are various applications which target whole websites instead of single webpages. For example, companies are represented by websites, not by individual webpages. To answer queries targeted at websites, web directories are an established solution. In this paper, we introduce a novel focused website crawler to employ the paradigm of focused crawling for the search of relevant websites. The proposed crawler is based on a two-level architecture and corresponding crawl strategies with an explicit concept of websites. The external crawler views the web as a graph of linked websites, selects the websites to be examined next and invokes internal crawlers. Each internal crawler views the webpages of a single given website and performs focused (page) crawling within that website. Our experimental evaluation demonstrates t...
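The two-level architecture described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the in-memory `TOY_WEB`, the keyword-based `page_score`, the averaging of page scores into a site score, the `threshold`, and the page and site budgets are all hypothetical choices made only to keep the example self-contained. The sketch shows an external crawler that keeps a priority queue of websites and invokes an internal crawler, which in turn performs a best-first page crawl restricted to a single site.

```python
import heapq
from urllib.parse import urlparse

# Hypothetical toy web: URL -> (page text, outgoing links). Stands in for HTTP fetches.
TOY_WEB = {
    "http://a.example/": ("database systems research",
                          ["http://a.example/pubs", "http://b.example/"]),
    "http://a.example/pubs": ("database indexing papers", ["http://c.example/"]),
    "http://b.example/": ("cooking recipes", ["http://b.example/pasta"]),
    "http://b.example/pasta": ("pasta recipes", []),
    "http://c.example/": ("database conference", []),
}
TOPIC = {"database", "indexing"}  # assumed topic vocabulary


def site_of(url):
    """Identify a website by its host name (an assumption of this sketch)."""
    return urlparse(url).netloc


def page_score(text):
    """Assumed relevance measure: fraction of words that are topic words."""
    words = text.split()
    return sum(w in TOPIC for w in words) / len(words) if words else 0.0


def internal_crawl(entry_url, max_pages=10):
    """Best-first page crawl inside one website.

    Returns (site_score, external_links), where external_links maps each
    outgoing URL on another site to the best score of a page linking to it.
    """
    site = site_of(entry_url)
    frontier = [(-1.0, entry_url)]  # max-heap via negated priority
    seen = {entry_url}
    scores, external = [], {}
    while frontier and len(scores) < max_pages:
        _, url = heapq.heappop(frontier)
        if url not in TOY_WEB:
            continue
        text, links = TOY_WEB[url]
        score = page_score(text)
        scores.append(score)
        for link in links:
            if site_of(link) == site:
                if link not in seen:
                    seen.add(link)
                    heapq.heappush(frontier, (-score, link))
            else:
                external[link] = max(external.get(link, 0.0), score)
    site_score = sum(scores) / len(scores) if scores else 0.0
    return site_score, external


def external_crawl(seed_urls, threshold=0.3, max_sites=10):
    """Crawl the graph of websites, visiting the most promising site next."""
    frontier = [(-1.0, url) for url in seed_urls]
    heapq.heapify(frontier)
    visited, relevant = set(), []
    while frontier and len(visited) < max_sites:
        _, url = heapq.heappop(frontier)
        site = site_of(url)
        if site in visited:
            continue
        visited.add(site)
        site_score, external = internal_crawl(url)
        if site_score >= threshold:
            relevant.append(site)
        for link, score in external.items():
            if site_of(link) not in visited:
                heapq.heappush(frontier, (-score, link))
    return relevant
```

Calling `external_crawl(["http://a.example/"])` on the toy data visits three sites and reports only those whose average page score reaches the threshold. A real crawler would replace `TOY_WEB` with fetching and parsing, and would likely use a trained classifier rather than keyword counting.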
Added 02 Jul 2010
Updated 02 Jul 2010
Type Conference
Year 2004
Where VLDB
Authors Martin Ester, Hans-Peter Kriegel, Matthias Schubert