Sciweavers

» Distributed community crawling
ICDE
2007
IEEE
14 years 8 months ago
DSphere: A Source-Centric Approach to Crawling, Indexing and Searching the World Wide Web
We describe DSPHERE, a decentralized system for crawling, indexing, searching and ranking of documents in the World Wide Web. Unlike most of the existing search technologies tha...
Bhuvan Bamba, Ling Liu, James Caverlee, Vaibhav Pa...
WWW
2011
ACM
13 years 2 months ago
we.b: the web of short urls
Short URLs have become ubiquitous. Especially popular within social networking services, short URLs have seen a significant increase in their usage over the past years, mostly du...
Demetres Antoniades, Iasonas Polakis, Georgios Kon...
WWW
2007
ACM
14 years 8 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
HICSS
1999
IEEE
13 years 11 months ago
Collaborative Web Crawling: Information Gathering/Processing over Internet
The main objective of the IBM Grand Central Station (GCS) is to gather information in virtually any format (text, data, image, graphics, audio, video) from the cyberspace...
Shang-Hua Teng, Qi Lu, Matthias Eichstaedt, Daniel...
ERCIMDL
2005
Springer
14 years 27 days ago
A Comparison of On-Line Computer Science Citation Databases
This paper examines the differences and similarities between the two on-line computer science citation databases DBLP and CiteSeer. The database entries in DBLP are inserted manual...
Vaclav Petricek, Ingemar J. Cox, Hui Han, Isaac G....