This paper describes TimeLine, an efficient archive service for a distributed storage system. TimeLine allows users to take snapshots on demand. The archive is stored online so th...
This paper proposes and presents a comparison of scheduling algorithms for load balancing the query traffic on distributed inverted files. We put em...
We describe the design and implementation of a high performance cloud that we have used to archive, analyze and mine large distributed data sets. By a cloud, we mean an infrastruc...
We identify crucial design issues in building a distributed inverted index for a large collection of web pages. We introduce a novel pipelining technique for structuring the core ...