A Complete Optical Character Recognition Methodology for Historical Documents

This paper presents a complete OCR methodology for recognizing historical documents, either printed or handwritten, without any knowledge of the font. The methodology consists of three steps: the first two create a training database from a set of documents, while the third recognizes new document images. The first step is pre-processing, which includes image binarization and enhancement. The second step uses a top-down segmentation approach to detect text lines, words and characters. A clustering scheme then groups characters of similar shape. This procedure is semi-automatic, since the user can intervene at any time to correct clustering errors and assign an ASCII label. After this step, a database is created for use in recognition. Finally, in the third step, every new document image is segmented with the same approach, while the recognition is...
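
The database-building steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the global threshold, the projection-profile splitting, the pixel-difference distance and the greedy clustering are all assumptions chosen for brevity, and every function name (`binarize`, `runs`, `segment`, `distance`, `cluster`) is hypothetical. Word-level segmentation, image enhancement, the user's labelling of clusters and the recognition step itself are left out.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of grey values: 0 = black ink, 255 = white paper


def binarize(gray: Image, threshold: int = 128) -> Image:
    """Global threshold (an assumption, not the paper's method). 1 = ink, 0 = paper."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return half-open [start, end) spans where the projection profile is non-zero."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def segment(binary: Image) -> List[Image]:
    """Top-down split: rows into text lines, then each line's columns into glyphs."""
    glyphs = []
    for top, bottom in runs([sum(row) for row in binary]):
        line = binary[top:bottom]
        col_profile = [sum(col) for col in zip(*line)]
        for left, right in runs(col_profile):
            glyphs.append([row[left:right] for row in line])
    return glyphs


def distance(a: Image, b: Image) -> float:
    """Fraction of differing pixels; glyphs of different size count as maximally distant."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        return 1.0
    total = len(a) * len(a[0])
    diff = sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return diff / total


def cluster(glyphs: List[Image], max_dist: float = 0.2) -> List[List[int]]:
    """Greedy clustering: a glyph joins the first cluster whose first member is close enough."""
    clusters: List[List[int]] = []
    for idx, glyph in enumerate(glyphs):
        for members in clusters:
            if distance(glyphs[members[0]], glyph) <= max_dist:
                members.append(idx)
                break
        else:
            clusters.append([idx])
    return clusters
```

Calling `cluster(segment(binarize(gray)))` on a grey-level page returns lists of glyph indices, one list per shape group. In a semi-automatic workflow like the one the abstract describes, a user would then review each group, fix wrong assignments and attach a label to it before the groups are stored for recognition.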

Georgios Vamvakas, Basilios Gatos, Nikolaos Stamatopoulos, Stavros J. Perantonis

Complete OCR Methodology | DAS 2008 | Document Analysis | Enhancement Takes Place | Segmentation Approach

» A comprehensive evaluation methodology for noisy historical document recognition technique...

» Character Enhancement for Historical Newspapers Printed Using Hot Metal Typesetting

» A Novel Feature Extraction and Classification Methodology for the Recognition of Historica...

» Unsupervised Evaluation Methods Based on Local GrayIntensity Variances for Binarization of...

» An Open Source Tesseract Based Optical Character Recognizer for Bangla Script

» Accessing the content of Greek historical documents

» Word matching using single closed contours for indexing handwritten historical documents

» Motion Deblurring for Optical Character Recognition

Post Info

Added	19 Oct 2010
Updated	19 Oct 2010
Type	Conference
Year	2008
Where	DAS
Authors	Georgios Vamvakas, Basilios Gatos, Nikolaos Stamatopoulos, Stavros J. Perantonis
