Sciweavers

611 search results - page 10 / 123
» Random web crawls
ADC
2004
Springer
Database
14 years 1 month ago
Performance and Cost Tradeoffs in Web Search.
Web search engines crawl the web to fetch the data that they index. In this paper we re-examine that need, and evaluate the network costs associated with data acquisition, and alt...
Nick Craswell, Francis Crimmins, David Hawking, Al...
WEBI
2009
Springer
14 years 2 months ago
Learning Deep Web Crawling with Diverse Features
The key to Deep Web crawling is to submit promising keywords to query form and retrieve Deep Web content efficiently. To select keywords, existing methods make a decision based ...
Lu Jiang, Zhaohui Wu, Qinghua Zheng, Jun Liu
ICDE
2006
IEEE
Database
14 years 1 month ago
Finding Thai Web Pages in Foreign Web Spaces
While the Web has been increasingly recognized as a culturally valuable social artifact, many nations endeavor to create national Web archives for long term preservation. However, ...
Kulwadee Somboonviwat, Takayuki Tamura, Masaru Kit...
WWW
2004
ACM
14 years 8 months ago
Small world peer networks in distributed web search
In ongoing research, a collaborative peer network application is being proposed to address the scalability limitations of centralized search engines. Here we introduce a local ada...
Ruj Akavipat, Le-Shin Wu, Filippo Menczer
WWW
2001
ACM
14 years 8 months ago
Crawling the Hidden Web
Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of Web pages reachable purely by following hypertext links, ignoring search forms and pag...
Sriram Raghavan, Hector Garcia-Molina