Indexes for large collections are often divided into shards that are distributed across multiple computers and searched in parallel to provide rapid interactive search. Typically,...
Spoken document retrieval (SDR) has been extensively studied in recent years because of its potential use in navigating large multimedia collections in the near future. This paper...
There has been a great amount of work on query-independent summarization of documents. However, due to the success of Web search engines query-specific document summarization (que...
In this paper, we argue that the agglomerative clustering with vector cosine similarity measure performs poorly due to two reasons. First, the nearest neighbors of a document belo...