Modeling term dependence has been shown to have a significant positive impact on retrieval. Current models, however, use sequential term dependencies, which leads to increased query latency, especially for long queries. In this paper, we examine two query segmentation models that reduce the number of dependencies. We find that two-stage segmentation, based on both the syntactic structure of the query and external information sources such as query logs, attains retrieval performance comparable to that of the sequential dependence model while achieving a 50% reduction in query latency.

Categories and Subject Descriptors: H.3.3 [Information Search and Retrieval]: Query Formulation

General Terms: Algorithms, Experimentation, Theory
Michael Bendersky, W. Bruce Croft, David A. Smith
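
To make the abstract's contrast concrete, the following minimal sketch counts adjacent-term dependencies for a whole query versus a segmented one. The example query, its segmentation, and the function names are illustrative assumptions, not taken from the paper; the sketch shows only why segmentation yields fewer dependencies and does not reproduce the authors' segmentation models or their latency measurements.

```python
from typing import List, Tuple


def sequential_pairs(terms: List[str]) -> List[Tuple[str, str]]:
    """Every adjacent term pair in the query (sequential dependencies)."""
    return list(zip(terms, terms[1:]))


def segmented_pairs(segments: List[List[str]]) -> List[Tuple[str, str]]:
    """Adjacent term pairs restricted to within each segment."""
    pairs: List[Tuple[str, str]] = []
    for segment in segments:
        pairs.extend(zip(segment, segment[1:]))
    return pairs


# Hypothetical query and hand-made segmentation, for illustration only.
query = "new york times square hotel deals".split()
segments = [["new", "york"], ["times", "square"], ["hotel", "deals"]]

print(len(sequential_pairs(query)), "sequential dependencies")
print(len(segmented_pairs(segments)), "within-segment dependencies")
```

In this toy case, pairs that cross a segment boundary (such as "york times") are dropped, so the retrieval engine has fewer dependency features to evaluate for the query.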