This paper considers the problem of identifying on the Web compound documents (cDocs): groups of web pages that in aggregate constitute semantically coherent information entities...
This paper presents a new pooling method for constructing the assessment sets used in the evaluation of retrieval systems. Our proposal is based on RankBoost, a machine learning v...
This paper presents two sentence retrieval methods. We adopt the task definition of the TREC Novelty Track: sentence retrieval consists of extracting the relevant sente...
Multimedia documents are collections of media objects, synchronized by means of sets of temporal and spatial constraints. Any multimedia document definition is valid as long as t...
Paola Bertolotti, Ombretta Gaggi, Maria Luisa Sapi...
In this paper we present a novel method for removing page rule lines in monochromatic handwritten Arabic documents using subspace methods with minimal effect on the quality of the...
Wael Abd-Almageed, Jayant Kumar, David S. Doermann