Sciweavers

299 search results - page 58 / 60
» User-centric Web crawling
WEBI
2007
Springer
14 years 2 months ago
Determining Bias to Search Engines from Robots.txt
Search engines largely rely on robots (i.e., crawlers or spiders) to collect information from the Web. Such crawling activities can be regulated from the server side by deploying ...
Yang Sun, Ziming Zhuang, Isaac G. Councill, C. Lee...
HPDC
2003
IEEE
14 years 2 months ago
Distributed Pagerank for P2P Systems
This paper defines and describes a fully distributed implementation of Google’s highly effective Pagerank algorithm, for “peer to peer” (P2P) systems. The implementation is ...
Karthikeyan Sankaralingam, Simha Sethumadhavan, Ja...
CLOUD
2010
ACM
14 years 1 month ago
Stateful bulk processing for incremental analytics
This work addresses the need for stateful dataflow programs that can rapidly sift through huge, evolving data sets. These data-intensive applications perform complex multi-step c...
Dionysios Logothetis, Christopher Olston, Benjamin...
MM
2009
ACM
14 years 1 month ago
Distance metric learning from uncertain side information with application to automated photo tagging
Automated photo tagging is essential to make massive unlabeled photos searchable by text search engines. Conventional image annotation approaches, though working reasonably well o...
Lei Wu, Steven C. H. Hoi, Rong Jin, Jianke Zhu, Ne...
VLDB
1999
ACM
14 years 29 days ago
Distributed Hypertext Resource Discovery Through Examples
We describe the architecture of a hypertext resource discovery system using a relational database. Such a system can answer questions that combine page contents, metadata, and hyp...
Soumen Chakrabarti, Martin van den Berg, Byron Dom