Large collections of scanned documents (books and journals) are now available in Digital Libraries. The most common method for retrieving relevant information from these collectio...
Skew estimation and page segmentation are two closely related processing stages in document image analysis. Skew estimation needs proper page segmentation, especially for doc...
A hierarchical algorithm is presented for determining the similarity and equivalence of document images. Features extracted from the CCITT fax-compressed representations of two im...
This contribution proposes a compositionality architecture for visual object categorization, i.e., learning and recognizing multiple visual object classes in unsegmented, cluttered...
In this paper, we present a novel framework for machine learning-based cross-media knowledge extraction. The framework is specifically designed to handle documents composed of th...