In information retrieval, queries can fail to find documents due to mismatch in terminology. Query expansion is a well-known technique addressing this problem, where additional q...
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
Modern retrieval test collections are built through a process called pooling in which only a sample of the entire document set is judged for each topic. The idea behind pooling is...
Chris Buckley, Darrin Dimmick, Ian Soboroff, Ellen...
The classical (ad hoc) document retrieval problem has been traditionally approached through ranking according to heuristically developed functions (such as tf.idf or bm25) or gene...