Sciweavers

» Web Crawling
WWW
2005
ACM
14 years 10 months ago
Fully automatic wrapper generation for search engines
When a query is submitted to a search engine, the search engine returns a dynamically generated result page containing the result records, each of which usually consists of a link...
Hongkun Zhao, Weiyi Meng, Zonghuan Wu, Vijay Ragha...
HPDC
2003
IEEE
14 years 3 months ago
Distributed Pagerank for P2P Systems
This paper defines and describes a fully distributed implementation of Google’s highly effective Pagerank algorithm, for “peer to peer” (P2P) systems. The implementation is ...
Karthikeyan Sankaralingam, Simha Sethumadhavan, Ja...
CLOUD
2010
ACM
14 years 2 months ago
Stateful bulk processing for incremental analytics
This work addresses the need for stateful dataflow programs that can rapidly sift through huge, evolving data sets. These data-intensive applications perform complex multi-step c...
Dionysios Logothetis, Christopher Olston, Benjamin...
WEBI
2007
Springer
14 years 3 months ago
Determining Bias to Search Engines from Robots.txt
Search engines largely rely on robots (i.e., crawlers or spiders) to collect information from the Web. Such crawling activities can be regulated from the server side by deploying ...
Yang Sun, Ziming Zhuang, Isaac G. Councill, C. Lee...
ISPAN
2005
IEEE
14 years 3 months ago
Supervised Peer-to-Peer Systems
In this paper we present a general methodology for designing supervised peer-to-peer systems. A supervised peer-to-peer system is a system in which the overlay network is formed b...
Kishore Kothapalli, Christian Scheideler