Ranking documents in a selected corpus plays an important role in information retrieval systems. Despite notable advances in this direction, with continuously accumulating text documents, maintaining up-to-date ordering among documents in the domains of interest is a challenging task. Conventional approaches can produce an ordering that is only valid within a given corpus. Thus, with such approaches, ordering should be completely redone as documents are added to or deleted from the corpus. In this paper, we introduce a corpusindependent framework for rapid ordering of documents in a dynamically changing corpus. Like in many practical approaches, our framework suggests utilizing a similarity measure in some metric space indicating the degree of relevance of a document to the domain of interest. However, unlike in corpusdependent approaches, the relevance score of a document remains valid with changes being introduced into the corpus (insertion of new documents, for example), thus allow...
Byung-Hoon Park, Nagiza F. Samatova, Rajesh Munava