Keeping track of changes in user interests from a document stream with only a few relevance judgments is a difficult task. To tackle this problem, we propose a novel method that integr...
Evaluation of IR systems has always been difficult because of the need for manually assessed relevance judgments. The advent of large editor-driven taxonomies on the web opens the...
Steven M. Beitzel, Eric C. Jensen, Abdur Chowdhury...
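A minimal sketch of how an editor-driven taxonomy can stand in for manual judgments, in the spirit of the entry above: a retrieved document counts as pseudo-relevant when editors filed it under the same category the query maps to. The doc_category mapping and the category names below are invented placeholders, not the paper's actual setup.

```python
# Pseudo-judgments from an editor-driven taxonomy (invented placeholder data).
# A retrieved document is treated as pseudo-relevant when editors filed it
# under the same category that the query maps to.

def pseudo_precision_at_k(ranking, query_category, doc_category, k=10):
    """Precision@k computed against taxonomy categories instead of qrels."""
    top = ranking[:k]
    hits = sum(1 for doc in top if doc_category.get(doc) == query_category)
    return hits / k

doc_category = {"d1": "Sports/Soccer", "d2": "Arts/Music", "d3": "Sports/Soccer"}
print(pseudo_precision_at_k(["d1", "d2", "d3"], "Sports/Soccer", doc_category, k=3))
# 0.666...
```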
The empirical investigation of the effectiveness of information retrieval (IR) systems requires a test collection, a set of query topics, and a set of relevance judgments made by ...
Soboroff, Nicholas, and Cahan recently proposed a method for evaluating the performance of retrieval systems without relevance judgments. They demonstrated that the system evaluat...
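The method this entry refers to can be sketched in a few lines: pool the top documents from all submitted runs, draw a random subset as "pseudo-relevant", and rank systems by how many pseudo-relevant documents they retrieve. The pool depth, sampling rate, and toy runs below are illustrative choices, not the published parameters.

```python
import random

def pseudo_qrels(runs, depth=100, rate=0.1, seed=0):
    """Sample a random 'pseudo-relevant' subset from the pooled runs."""
    pool = set()
    for ranking in runs.values():
        pool.update(ranking[:depth])
    rng = random.Random(seed)
    return set(rng.sample(sorted(pool), max(1, int(rate * len(pool)))))

def rank_systems(runs, qrels, cutoff=100):
    """Order systems by precision against the pseudo-qrels."""
    score = {name: sum(d in qrels for d in ranking[:cutoff]) / cutoff
             for name, ranking in runs.items()}
    return sorted(score, key=score.get, reverse=True)

runs = {"sysA": [f"d{i}" for i in range(200)],
        "sysB": [f"d{i}" for i in range(199, -1, -1)]}
print(rank_systems(runs, pseudo_qrels(runs)))
```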
Forming test collection relevance judgments from the pooled output of multiple retrieval systems has become the standard process for creating resources such as the TREC, CLEF, and...
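For concreteness, here is a minimal sketch of the depth-k pooling process the entry above describes: the pool for a topic is the union of the top-k documents from every participating run, and only pooled documents go to assessors. The depth and the toy runs are invented.

```python
def depth_k_pool(runs, k=100):
    """Union of each run's top-k documents; only these get judged."""
    pool = set()
    for ranking in runs:
        pool.update(ranking[:k])
    return pool

runs = [["d3", "d1", "d7"], ["d1", "d9", "d3"], ["d2", "d3", "d8"]]
print(sorted(depth_k_pool(runs, k=2)))  # ['d1', 'd2', 'd3', 'd9']
```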
This paper examines the reliability of implicit feedback generated from clickthrough data in WWW search. Analyzing the users’ decision process using eyetracking and comparing im...
Thorsten Joachims, Laura A. Granka, Bing Pan, Hele...
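One widely cited strategy from this line of work interprets a click as a relative preference rather than an absolute judgment: the clicked result is preferred over every higher-ranked result the user skipped. A minimal sketch, with an invented click log:

```python
def click_skip_above(ranking, clicked):
    """Pairs (preferred, over): a clicked doc beats each skipped doc above it."""
    prefs = []
    for i, doc in enumerate(ranking):
        if doc in clicked:
            prefs.extend((doc, above) for above in ranking[:i]
                         if above not in clicked)
    return prefs

print(click_skip_above(["d1", "d2", "d3", "d4"], clicked={"d3"}))
# [('d3', 'd1'), ('d3', 'd2')]
```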
We consider the problem of evaluating retrieval systems using a limited number of relevance judgments. Recent work has demonstrated that one can accurately estimate average precis...
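In the same spirit as the entry above (though not necessarily the authors' exact estimator), average precision can be estimated from a uniform random sample of judgments by reweighting each observed term of the AP sum by its sampling probability. The sampling rate and synthetic data below are assumptions for illustration.

```python
import random

def estimate_ap(ranking, sampled_rel, p):
    """Estimate AP when each document was judged independently with prob. p.

    Terms involving one sampled relevant doc are reweighted by 1/p, terms
    involving a pair by 1/p^2, and R is estimated as (#sampled relevant)/p.
    """
    total, rel_above = 0.0, 0
    for k, doc in enumerate(ranking, start=1):
        if doc in sampled_rel:
            total += (1.0 + rel_above / p) / (p * k)
            rel_above += 1
    est_R = len(sampled_rel) / p
    return total / est_R if est_R else 0.0

random.seed(1)
ranking = [f"d{i}" for i in range(1, 101)]
truth = {f"d{i}" for i in range(1, 101, 5)}              # 20 relevant docs
judged = {d for d in ranking if random.random() < 0.3}   # ~30% sample
print(round(estimate_ap(ranking, judged & truth, p=0.3), 3))
```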
Well-tuned Large-Vocabulary Continuous Speech Recognition (LVCSR) has been shown to be generally more effective than vocabulary-independent techniques for ranked retrieval of spo...
As document collections grow larger, the information needs and relevance judgments in a test collection must be well-chosen within a limited budget to give the most reliable and ro...
Ben Carterette, Virgiliu Pavlu, Evangelos Kanoulas...
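One simple way to picture the wide-and-shallow trade-off discussed above (not the authors' selection algorithm, which chooses documents by their expected impact on the evaluation) is a round-robin spend of the judging budget, so that many topics each receive a few judgments:

```python
def allocate_budget(pools, budget):
    """pools: topic -> docs in priority order; spend the budget round-robin."""
    allocation = {t: [] for t in pools}
    cursors = {t: 0 for t in pools}
    active = list(pools)
    while budget > 0 and active:
        for topic in list(active):
            if budget == 0:
                break
            if cursors[topic] >= len(pools[topic]):
                active.remove(topic)   # this topic's pool is exhausted
                continue
            allocation[topic].append(pools[topic][cursors[topic]])
            cursors[topic] += 1
            budget -= 1
    return allocation

pools = {"q1": ["a", "b", "c"], "q2": ["d"], "q3": ["e", "f"]}
print(allocate_budget(pools, budget=4))
# {'q1': ['a', 'b'], 'q2': ['d'], 'q3': ['e']}
```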
Collecting relevance judgments (qrels) is an especially challenging part of building an information retrieval test collection. This paper presents a novel method for crea...