Relevance profiling is a general process for within-document retrieval. Given a query, a profile of retrieval status values is computed by sliding a fixed-size window across a document. In this paper, we report a series of bench experiments on relevance profiling, using an existing electronic book and its associated book index. The book index is the source of queries and relevance judgements for the experiments. Three weighting functions based on a language modelling approach are investigated, and we demonstrate that the well-known query generation model outperforms one based on the Kullback-Leibler divergence and one based on simple term frequency. The relevance profiling process proved highly effective in retrieving relevant pages within the electronic book, and exhibited stable performance over a range of sliding window sizes. The experimental study provides evidence for the effectiveness of relevance profiling for within-document retrieval, with the caveat that the experiment was...
David J. Harper, David Lee
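The sliding-window computation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `relevance_profile`, the default window size and step, and the smoothing weight `lam` are hypothetical. The score is a query-generation (query likelihood) log-probability, and the linear interpolation between window and whole-document term frequencies is an assumed smoothing choice, since the abstract does not say how the models were smoothed.

```python
import math
from collections import Counter


def relevance_profile(doc_tokens, query_tokens, window_size=200, step=50, lam=0.5):
    """Return (window_start, score) pairs for one tokenised document.

    Assumption: each window is scored by the log-probability of generating
    the query from a window language model, linearly interpolated with a
    whole-document model (weight `lam` on the window).
    """
    if not doc_tokens:
        return []
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)
    profile = []
    # Slide the fixed-size window; a short document yields a single window.
    last_start = max(doc_len - window_size, 0)
    for start in range(0, last_start + 1, step):
        window = doc_tokens[start:start + window_size]
        win_counts = Counter(window)
        score = 0.0
        for term in query_tokens:
            p_win = win_counts[term] / len(window)
            p_doc = doc_counts[term] / doc_len
            p = lam * p_win + (1.0 - lam) * p_doc
            if p == 0.0:
                # The term occurs nowhere in the document.
                score = float("-inf")
                break
            score += math.log(p)
        profile.append((start, score))
    return profile
```

Plotting the scores against the window start positions gives the profile, and its peaks suggest the passages most likely to be relevant to the query. Mapping window positions back to pages, as the experiments would require, is left out of this sketch.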