We describe our participation in the TREC 2003 Robust and Web tracks. For the Robust track, we experimented with the impact of stemming and feedback on the worst scoring topics. ...
Jaap Kamps, Christof Monz, Maarten de Rijke, B&oum...
Document representations can rapidly become unwieldy if they try to encapsulate all possible document properties, ranging from abstract structure to detailed rendering and layout. We pres...
Deriving a thematically meaningful partition of an unlabeled document corpus is a challenging task. In this context, the use of document representations based on latent thematic ge...
In this paper we tackle the problem of document image retrieval by combining a similarity measure between documents and the probability that a given document belongs to a certain ...
Albert Gordo, Jaume Gibert, Ernest Valveny, Mar&cc...
This paper investigates the pre-conditions for successful combination of document representations formed from structural markup for the task of known-item search. As this task is ...
The extension approach of frequent itemset mining can be applied to discover the relations among documents. Several schemes, i.e., n-gram, stemming, stopword removal and term w...