ICDAR
1999
IEEE

MergeLayouts: Overcoming Faulty Segmentations by a Comprehensive Voting of Commercial OCR Devices

In this paper, we present a comprehensive voting approach that takes entire layouts obtained from commercial OCR devices as input. Such a layout comprises segments of three kinds: lines, words, and characters. By combining all attributes of a segment (e.g. recognized text, font height), we attain a "better" layout that represents the original page layout as well as possible. The voting process itself is hierarchically organized, starting with the line segments. For each level, a search tree is spawned and all fellow segments (segments from different layouts that denote the same image area) are established. The search is heuristic and guided by a similarity measure defined on segments. Deviations in the segmentation, as well as segmentation errors of individual commercial OCR devices, are compensated by an "equalization module".
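
The idea of grouping fellow segments and voting on their attributes can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or the search strategy, so this sketch assumes intersection-over-union of bounding boxes as the similarity, replaces the heuristic tree search with a greedy best-match per layout, and votes only on the recognized text. It omits the equalization module entirely. All names (`Segment`, `overlap`, `fellows`, `vote`) and the 0.5 threshold are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One line, word, or character box reported by an OCR device."""
    x0: int
    y0: int
    x1: int
    y1: int
    text: str


def overlap(a: Segment, b: Segment) -> float:
    """Intersection-over-union of two boxes (an assumed similarity measure)."""
    w = min(a.x1, b.x1) - max(a.x0, b.x0)
    h = min(a.y1, b.y1) - max(a.y0, b.y0)
    if w <= 0 or h <= 0:
        return 0.0
    inter = w * h
    area_a = (a.x1 - a.x0) * (a.y1 - a.y0)
    area_b = (b.x1 - b.x0) * (b.y1 - b.y0)
    return inter / (area_a + area_b - inter)


def fellows(reference, layouts, threshold=0.5):
    """Group each reference segment with its best match from every other layout.

    Greedy matching stands in for the heuristic tree search of the paper.
    """
    groups = []
    for seg in reference:
        group = [seg]
        for layout in layouts:
            best = max(layout, key=lambda s: overlap(seg, s), default=None)
            if best is not None and overlap(seg, best) >= threshold:
                group.append(best)
        groups.append(group)
    return groups


def vote(group):
    """Return the most frequent recognized text among fellow segments."""
    return Counter(s.text for s in group).most_common(1)[0][0]


# Three hypothetical OCR layouts, each with one line segment.
layout_a = [Segment(0, 0, 100, 20, "Hello world")]
layout_b = [Segment(2, 1, 101, 21, "Hello world")]
layout_c = [Segment(0, 0, 98, 19, "He1lo world")]

merged = [vote(g) for g in fellows(layout_a, [layout_b, layout_c])]
```

Here two of the three devices agree on the text, so the misread in `layout_c` is outvoted. A hierarchical version would repeat the same grouping and voting for the words inside each matched line, and then for the characters inside each matched word.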
Stefan Klink, Thorsten Jäger
Type Conference
Year 1999
Where ICDAR
Authors Stefan Klink, Thorsten Jäger