The larger amount of information on the Web is stored in document databases and is not indexed by general-purpose search engines (i.e., Google and Yahoo). Such information is dyna...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
Document representation and indexing is a key problem for document analysis and processing, such as clustering, classification and retrieval. Conventionally, Latent Semantic Index...
Okapi BM25 scoring of anchor text surrogate documents has been shown to facilitate effective ranking in navigational search tasks over web data. We hypothesize that even better r...
In this poster we present an overview of the techniques we used to develop and evaluate a text categorisation system for the PRINCIP project which sets out to automatically classi...
This paper describes the Patent Retrieval Task in the Fourth NTCIR Workshop, and the test collections produced in this task. We perform the invalidity search task, in which each p...
We investigate how users interact with the results page of a WWW search engine using eye-tracking. The goal is to gain into how users browse the presented abstracts and how they s...
We discuss a retrieval model in which the task is to complete a sentence, given an initial fragment, and given an application specific document collection. This model is motivate...
This paper presents a new dependence language modeling approach to information retrieval. The approach extends the basic language modeling approach based on unigram by relaxing th...
Jianfeng Gao, Jian-Yun Nie, Guangyuan Wu, Guihong ...
Empirical studies of information retrieval methods show that good retrieval performance is closely related to the use of various retrieval heuristics, such as TF-IDF weighting. On...
The Implicit Query (IQ) prototype is a system which automatically generates context-sensitive searches based on a user’s current computing activities. In the demo, we show IQ ru...
Susan T. Dumais, Edward Cutrell, Raman Sarin, Eric...