Positional ranking functions, widely used in web search engines, improve result quality by exploiting the positions of the query terms within documents. However, it is well known ...
We propose a new method to partition an unlabeled dataset, called Discriminative Context Partitioning (DCP). It is motivated by the idea of splitting the dataset based only on how...
This paper describes our approach to the 2006 Ad Hoc Monolingual Information Retrieval run for French. The goal of our experiment was to compare the performance of a proposed stati...
The TREC .GOV collection makes a valuable web testbed for distributed information retrieval methods because it is naturally partitioned and includes 725 web-oriented queries with ...
Scalable similarity search is at the core of many large-scale learning or data mining applications. Recently, many research results have demonstrated that one promising approach is creatin...