Sciweavers

ICDAR
2003
IEEE

Document page similarity based on layout visual saliency: Application to query by example and document classification

In this paper we propose a measure of visual similarity for comparing pages in a corpus. The measure is based on an analysis of the visual layout saliency of the page composition, and it is computed from both the document layout and characteristics of the text itself. The text characterization uses statistical features derived from textural primitives. Our purpose is to establish perceptual links between documents in order to facilitate their storage and retrieval. We present two possible applications of this similarity measure: querying the corpus by example and document classification. In the first application, we extract the documents that are most visually similar to a document given as a query. In the second application, the similarity measure is used to classify the document under investigation according to its visual similarity to a reference set of documents. Our test corpus is extracted from the Finland MTDB Oulu multi-genr...
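The two applications named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity function or the feature extraction, so everything below is an assumption for illustration only: each page is represented by a hypothetical fixed-length feature vector, cosine similarity stands in for the paper's layout-saliency measure, and the function names `query_by_example` and `classify` are invented here.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Stand-in similarity (an assumption, not the paper's measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def query_by_example(query, corpus, top_k=3):
    """Return the top_k (name, score) pairs most similar to the query.

    corpus maps a page name to its hypothetical feature vector.
    """
    scored = [(name, cosine_similarity(query, vec)) for name, vec in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def classify(query, reference, k=3):
    """Label the query by majority vote among its k most similar references.

    reference is a list of (label, feature_vector) pairs.
    """
    ranked = sorted(
        reference,
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    votes = Counter(label for label, _ in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: three-component vectors with made-up values.
corpus = {"a": [1.0, 0.0, 0.0], "b": [0.9, 0.1, 0.0], "c": [0.0, 1.0, 0.0]}
nearest = query_by_example([1.0, 0.0, 0.0], corpus, top_k=2)

reference = [
    ("article", [1.0, 0.0, 0.0]),
    ("article", [0.9, 0.1, 0.0]),
    ("advertisement", [0.0, 1.0, 0.0]),
    ("advertisement", [0.0, 0.9, 0.1]),
]
label = classify([0.95, 0.05, 0.0], reference, k=3)
```

Both functions depend only on a pairwise similarity score, so any other measure, including one built from layout and texture features as the abstract describes, could replace `cosine_similarity` without changing the retrieval or voting logic.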
Véronique Eglin, Stéphane Bres
Added 04 Jul 2010
Updated 04 Jul 2010
Type Conference
Year 2003
Where ICDAR
Authors Véronique Eglin, Stéphane Bres