Query reformulation modifies the original query with the aim of better matching the vocabulary of the relevant documents, and consequently improving ranking effectiveness. Previous techniques typically generate words and phrases related to the original query, but do not consider how these words and phrases would fit together in new queries. In this paper, we focus on an implementation of an approach that models reformulation as a distribution of queries, where each query is a variation of the original query. This approach considers a query as a basic unit and can capture important dependencies between words and phrases in the query. The implementation discussed here is based on passage analysis of the target corpus. Experiments on the TREC collection show that the proposed model for query reformulation significantly outperforms state-of-the-art methods. Categories and Subject Descriptors H.3.3 [Information Storage and Retrieval]: Information Search and Retrieval General Terms Algorith...
Xiaobing Xue, W. Bruce Croft, David A. Smith