Beyond independent relevance: methods and evaluation metrics for subtopic retrieval

We present a non-traditional retrieval problem we call subtopic retrieval. The subtopic retrieval problem is concerned with finding documents that cover many different subtopics of a query topic. In such a problem, the utility of a document in a ranking is dependent on other documents in the ranking, violating the assumption of independent relevance which is assumed in most traditional retrieval methods. Subtopic retrieval poses challenges for evaluating performance, as well as for developing effective algorithms. We propose a framework for evaluating subtopic retrieval which generalizes the traditional precision and recall metrics by accounting for intrinsic topic difficulty as well as redundancy in documents. We propose and systematically evaluate several methods for performing subtopic retrieval using statistical language models and a maximal marginal relevance (MMR) ranking strategy. A mixture model combined with query likelihood relevance ranking is shown to modestly outperform...
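To make the two ideas in the abstract concrete, here is a minimal Python sketch, not the authors' implementation. It assumes subtopic coverage can be measured as the fraction of a topic's subtopics touched by the top-k documents, and it swaps the paper's statistical language models for a toy Jaccard similarity over term sets inside a generic greedy MMR loop. All names (`subtopic_recall`, `mmr_rank`, `lam`) and the toy data are hypothetical.

```python
from typing import Callable, Dict, List, Set


def subtopic_recall(ranking: List[str], doc_subtopics: Dict[str, Set[str]],
                    n_subtopics: int, k: int) -> float:
    """Fraction of the topic's subtopics covered by the top-k documents."""
    covered: Set[str] = set()
    for doc_id in ranking[:k]:
        covered |= doc_subtopics.get(doc_id, set())
    return len(covered) / n_subtopics


def mmr_rank(candidates: List[str], relevance: Dict[str, float],
             similarity: Callable[[str, str], float],
             lam: float = 0.5) -> List[str]:
    """Greedy MMR-style ranking: trade relevance against redundancy.

    `lam` (assumed parameter) weights relevance; 1 - lam weights the
    maximum similarity to any document already selected.
    """
    selected: List[str] = []
    remaining = list(candidates)
    while remaining:
        def score(doc_id: str) -> float:
            redundancy = max((similarity(doc_id, s) for s in selected),
                             default=0.0)
            return lam * relevance[doc_id] - (1 - lam) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up for illustration): d1 and d2 are near-duplicates.
terms = {"d1": {"x", "y"}, "d2": {"x", "y"}, "d3": {"z"}}
subtopics = {"d1": {"a", "b"}, "d2": {"a", "b"}, "d3": {"c"}}
relevance = {"d1": 0.9, "d2": 0.8, "d3": 0.5}


def jaccard(a: str, b: str) -> float:
    """Stand-in similarity; the paper uses language models instead."""
    union = terms[a] | terms[b]
    return len(terms[a] & terms[b]) / len(union) if union else 0.0


by_relevance = sorted(relevance, key=relevance.get, reverse=True)
by_mmr = mmr_rank(list(relevance), relevance, jaccard, lam=0.5)
```

On this toy data the relevance-only ranking places the two near-duplicate documents first, while the MMR-style ranking promotes the document covering the remaining subtopic. This is the dependence between documents in a ranking that the abstract describes.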

ChengXiang Zhai, William W. Cohen, John D. Lafferty

Retrieval Problem | SIGIR 2003 | Subtopic Retrieval | Subtopic Retrieval Problem |

Added	05 Jul 2010
Updated	05 Jul 2010
Type	Conference
Year	2003
Where	SIGIR
Authors	ChengXiang Zhai, William W. Cohen, John D. Lafferty

Sciweavers
