Web search engines crawl the web to fetch the data that they index. In this paper we re-examine that need, and evaluate the network costs associated with data acquisition, and alt...
Nick Craswell, Francis Crimmins, David Hawking, Al...
—The key to Deep Web crawling is to submit promising keywords to query form and retrieve Deep Web content efficiently. To select keywords, existing methods make a decision based ...
While the Web has been increasingly recognized as a culturally valuable social artifact, many nations endeavor to create national Web archives for long term preservation. However, ...
In ongoing research, a collaborative peer network application is being proposed to address the scalability limitations of centralized search engines. Here we introduce a local ada...
Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of Web pages reachable purely by following hypertext links, ignoring search forms and pag...