Sciweavers

ICDAR
1999
IEEE

Document Image Layout Comparison and Classification

This paper describes features and methods for document image comparison and classification at the spatial layout level. The methods are useful for visual-similarity-based document retrieval as well as for fast initial document type classification without OCR. A novel feature set called interval encoding is introduced to capture elements of spatial layout. This feature set encodes region layout information in fixed-length vectors, which can be used for fast page layout comparison. The paper describes experiments and results on rank-ordering a set of document pages in terms of their layout similarity to a test document. We also demonstrate the usefulness of the features derived from interval encoding in a hidden Markov model based page layout classification system that is trainable and extendible.
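The rank-ordering step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each page has already been reduced to a fixed-length layout vector; the vectors, the page names, the function names, and the choice of L1 distance are all hypothetical and are not taken from the paper, which does not specify them in this abstract. The sketch does not implement interval encoding itself.

```python
from typing import Dict, List, Sequence, Tuple


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute differences between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("layout vectors must have the same length")
    return sum(abs(x - y) for x, y in zip(a, b))


def rank_by_layout_similarity(
    query: Sequence[float],
    pages: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Return (page name, distance) pairs, most similar page first.

    Assumption: a smaller L1 distance means a more similar layout.
    Ties are broken by page name so the ordering is deterministic.
    """
    scored = [(name, l1_distance(query, vec)) for name, vec in pages.items()]
    return sorted(scored, key=lambda item: (item[1], item[0]))


# Hypothetical fixed-length layout vectors for three pages.
pages = {
    "letter": [0, 2, 2, 0],
    "article": [1, 1, 1, 1],
    "form": [3, 0, 0, 3],
}
query = [0, 2, 1, 0]
ranking = rank_by_layout_similarity(query, pages)
```

Because every vector has the same length, each comparison is a single pass over the components, which is what makes this kind of representation attractive for fast comparison.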
Jianying Hu, Ramanujan S. Kashi, Gordon T. Wilfong
Added 03 Aug 2010
Updated 03 Aug 2010
Type Conference
Year 1999
Where ICDAR