We consider the problem of wide-area large-scale text search over a peer-to-peer infrastructure. A wide-area search infrastructure with billions of documents and millions of searc...
Vijay Gopalakrishnan, Bobby Bhattacharjee, Peter J...
Data deduplication has become a popular technology for reducing the amount of storage space necessary for backup and archival data. Content defined chunking (CDC) techniques are w...
DelosDLMS is a prototype of a next-generation Digital Library (DL) management system. It is the result of integrating various specialized DL services provided by partners of the D...
To exploit the similarity information hidden in the hyperlink structure of the web, this paper introduces algorithms scalable to graphs with billions of vertices on a distributed ...
Our current concern is a scalable infrastructure for information retrieval (IR) with up-to-date retrieval results in the presence of frequent, continuous updates. Timely processin...