Citation indexes are valuable tools for research, in part because they provide a means with which to measure the relative impact of articles in a collection of scientifi...
This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Inverted files have been very successful for document retrieval, but sponsored search is different. Inverted files are designed to find documents that match the query (a...
We address a specific enterprise document search scenario, where the information need is expressed in an elaborate manner. In our scenario, information needs are expressed using a...
Krisztian Balog, Wouter Weerkamp, Maarten de Rijke