ICDE
2007
IEEE

DSphere: A Source-Centric Approach to Crawling, Indexing and Searching the World Wide Web

We describe DSPHERE, a decentralized system for crawling, indexing, searching and ranking documents on the World Wide Web. Unlike most existing search technologies, which depend heavily on a page-centric view of the Web, we advocate a source-centric view and propose a decentralized architecture for crawling, indexing and searching the Web in a distributed, source-specific fashion. We develop a fully decentralized crawler in which each peer is responsible for crawling a specific set of documents, referred to as a source collection. Documents are ranked using link analysis. Traditional link analysis techniques suffer from slow refresh rates and vulnerability to Web spam; to counter these problems, we propose a source-based link analysis algorithm that computes fast and accurate ranking scores for all crawled documents.
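
The abstract does not spell out the source-based link analysis algorithm, so the following is only a minimal sketch of the general idea of ranking sources rather than pages. It assumes a source is identified by a URL's host name, collapses page-level links into weighted source-level edges, and runs a standard PageRank-style power iteration over the resulting graph. The names `source_of`, `build_source_graph` and `source_rank`, the damping factor and the iteration count are all illustrative choices, not taken from the paper.

```python
from collections import defaultdict
from urllib.parse import urlparse


def source_of(url):
    """Map a page URL to a source. Assumption: a source is the URL's host name."""
    return urlparse(url).netloc.lower()


def build_source_graph(page_links):
    """Collapse page-level links (src_url, dst_url) into weighted source-level edges."""
    counts = defaultdict(lambda: defaultdict(int))
    sources = set()
    for src_url, dst_url in page_links:
        s, d = source_of(src_url), source_of(dst_url)
        sources.update((s, d))
        if s != d:  # links inside one source do not count as endorsements
            counts[s][d] += 1
    return sources, counts


def source_rank(page_links, damping=0.85, iterations=50):
    """PageRank-style power iteration over the source graph (illustrative only)."""
    sources, counts = build_source_graph(page_links)
    n = len(sources)
    if n == 0:
        return {}
    rank = {s: 1.0 / n for s in sources}
    for _ in range(iterations):
        nxt = {s: (1.0 - damping) / n for s in sources}
        dangling = 0.0
        for s in sources:
            out = counts.get(s)
            if not out:
                # A source with no outgoing edges spreads its rank uniformly.
                dangling += rank[s]
                continue
            total = sum(out.values())
            for d, weight in out.items():
                nxt[d] += damping * rank[s] * weight / total
        for s in sources:
            nxt[s] += damping * dangling / n
        rank = nxt
    return rank


if __name__ == "__main__":
    # Hypothetical page-level links, for illustration only.
    links = [
        ("http://a.com/1", "http://b.com/x"),
        ("http://a.com/2", "http://b.com/y"),
        ("http://c.com/1", "http://b.com/z"),
        ("http://b.com/x", "http://a.com/1"),
    ]
    for source, score in sorted(source_rank(links).items(), key=lambda kv: -kv[1]):
        print(source, round(score, 4))
```

Because the graph has one node per source instead of one per page, it is far smaller than the page graph, which is one plausible reason a source-level computation could refresh faster; whether DSPHERE weights or normalizes edges this way is not stated in the abstract.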
Type Conference
Year 2007
Where ICDE
Authors Bhuvan Bamba, Ling Liu, James Caverlee, Vaibhav Padliya, Mudhakar Srivatsa, Tushar Bansal, Mahesh Palekar, Joseph Patrao, Suiyang Li, Aameek Singh