

Text Lines and Snippets Extraction for 19th Century Handwriting Documents Layout Analysis

In this paper we propose a new approach to improving electronic editions of human sciences corpora by providing an efficient estimation of manuscript page structure. In any handwritten document analysis process, text line segmentation is an important stage. Variable inter-line spacing, inconsistent baseline skew, overlaps and occlusions in unconstrained 19th-century handwritten documents complicate the text line segmentation task. The only prior knowledge of the script that we use is the fact that text line skews can be random and irregular. In that context, we model text line detection as an image segmentation problem, enhancing the text line structure with a Hough transform and a clustering of connected components so that text line boundaries appear. The proposed approach of snippet decomposition for page layout analysis relies on a first step of content page classification into five visual and genetic taxonomies, and a second step of text line extr...
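As a rough illustration of the idea of combining a Hough transform with a grouping of connected components, the sketch below votes in a (theta, rho) accumulator using component centroids and greedily peels off the best-supported line. It is not the authors' method: the function name `group_centroids_into_lines`, its parameters, the greedy peak selection, and the assumption that centroids have already been extracted from a binarized page are all hypothetical choices made for this example.

```python
import numpy as np

def group_centroids_into_lines(centroids, max_skew_deg=10.0, angle_step_deg=1.0,
                               rho_step=5.0, min_votes=3):
    """Greedy Hough voting over connected-component centroids given as (x, y).

    Illustrative sketch only; all names and default values are assumptions.
    Returns a list of lines, each a sorted list of centroid indices.
    """
    pts = np.asarray(centroids, dtype=float)
    # Text lines are assumed near-horizontal, so the line normal is near 90 degrees.
    skews = np.arange(-max_skew_deg, max_skew_deg + angle_step_deg, angle_step_deg)
    thetas = np.deg2rad(90.0 + skews)
    # Hough parameterization: rho = x*cos(theta) + y*sin(theta).
    rho = pts[:, [0]] * np.cos(thetas) + pts[:, [1]] * np.sin(thetas)
    bins = np.round(rho / rho_step).astype(int)  # shape (n_points, n_angles)

    remaining = np.ones(len(pts), dtype=bool)
    lines = []
    while remaining.sum() >= min_votes:
        best = None
        # Find the (angle, rho bin) cell with the most votes among unassigned points.
        for j in range(len(thetas)):
            values, counts = np.unique(bins[remaining, j], return_counts=True)
            k = counts.argmax()
            if best is None or counts[k] > best[0]:
                best = (counts[k], j, values[k])
        votes, j, b = best
        if votes < min_votes:
            break
        # Assign every unassigned centroid in the winning cell to one text line.
        members = np.flatnonzero(remaining & (bins[:, j] == b))
        lines.append(sorted(members.tolist()))
        remaining[members] = False
    return lines

# Two synthetic text lines at y = 10 and y = 50.
pts = [(0, 10), (20, 10), (40, 10), (60, 10),
       (0, 50), (20, 50), (40, 50), (60, 50)]
print(group_centroids_into_lines(pts))
```

Restricting the angle range to small skews keeps the accumulator small; a page with strongly irregular skews, as described in the abstract, would likely need a wider range or a local, per-region vote.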
Added 21 May 2010
Updated 21 May 2010
Type Conference
Year 2009
Authors Vincent Malleron, Véronique Eglin, Hubert Emptoz, Stéphanie Dord-Crouslé, Philippe Régnier