Nowadays, searching for information on the web or in any kind of document collection has become one of the most frequent activities. However, user queries can be formulated in a way th...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
Searching very large collections can be costly in both computation and storage. To reduce this cost, recent research has focused on reducing the size (pruning) of the inverted ind...
Recently, mean-variance analysis has been proposed as a novel paradigm to model document ranking in Information Retrieval. The main merit of this approach is that it diversifies t...
Guido Zuccon, Leif Azzopardi, Keith van Rijsbergen
In this report, we unify two quite distinct approaches to information retrieval: region models and language models. Region models were developed for structured document retrieval....