Indexes for large collections are often divided into shards that are distributed across multiple computers and searched in parallel to provide rapid interactive search. Typically,...
Distributed applications or workflows need to access and use compute, storage and network resources simultaneously or chronologically coordinated respectively. Examples are distri...
Abstract. This paper presents PLDA, our parallel implementation of Latent Dirichlet Allocation on MPI and MapReduce. PLDA smooths out storage and computation bottlenecks and provid...
Yi Wang, Hongjie Bai, Matt Stanton, Wen-Yen Chen, ...
Swarm is a storage system that provides scalable, reliable, and cost-effective data storage. Swarm is based on storage servers, rather than file servers; the storage servers are o...
Peer-to-peer storage systems assume that their users consume resources in proportion to their contribution. Unfortunately, users are unlikely to do this without some enforcement m...