
ICDAR
2007
IEEE

A Weighted Finite-State Framework for Correcting Errors in Natural Scene OCR

With the growing market for cheap cameras, natural scene text has to be handled efficiently. Some works deal with text detection in the image, while more recent ones point out the challenge of text extraction and recognition. We propose here an OCR correction system that handles traditional recognizer errors as well as those specific to natural scene images, i.e. cut characters, artistic display, incomplete sentences (present in advertisements) and out-of-vocabulary (OOV) words such as acronyms. The main algorithm is based on Finite-State Machines (FSMs) to deal with learned OCR confusions, capital/accented letters and lexicon look-up. Moreover, since the OCR is not treated as a black box, several of its outputs are taken into account to interleave the recognition and correction steps. Detailed results on a public database of natural scene words are presented, along with future work.
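The idea of combining learned OCR confusions with lexicon look-up over several recognizer outputs can be illustrated with a minimal sketch. This is not the authors' implementation: it replaces the weighted finite-state machinery with a plain dynamic-programming edit distance, which finds the same kind of cheapest rewrite path for a toy case. The confusion costs, the out-of-vocabulary threshold, and the names `CONFUSION_COST`, `weighted_distance`, and `correct` are all hypothetical and chosen only for illustration.

```python
from functools import lru_cache

# Hypothetical confusion costs: (recognized char, intended char) -> cost.
# A lower cost means the OCR engine is assumed to confuse the pair more often.
CONFUSION_COST = {("0", "o"): 0.2, ("1", "l"): 0.2, ("5", "s"): 0.3, ("c", "e"): 0.5}
DEFAULT_SUB = 1.0  # substitution not covered by the confusion table
INS_DEL = 1.0      # cost of a cut (missing) or spurious character


def weighted_distance(observed: str, word: str) -> float:
    """Cost of the cheapest sequence of edits rewriting `observed` into `word`."""

    @lru_cache(maxsize=None)
    def cost(i: int, j: int) -> float:
        if i == len(observed):
            return (len(word) - j) * INS_DEL
        if j == len(word):
            return (len(observed) - i) * INS_DEL
        a, b = observed[i], word[j]
        sub = 0.0 if a == b else CONFUSION_COST.get((a, b), DEFAULT_SUB)
        return min(
            sub + cost(i + 1, j + 1),   # match or substitute
            INS_DEL + cost(i + 1, j),   # drop a spurious recognized character
            INS_DEL + cost(i, j + 1),   # restore a character the OCR missed
        )

    return cost(0, 0)


def correct(hypotheses, lexicon, oov_threshold=1.5):
    """Pick the cheapest lexicon word over all OCR hypotheses.

    `hypotheses` is a list of (string, recognizer_cost) pairs. If no lexicon
    word is close enough, the input is treated as out-of-vocabulary and the
    recognizer's cheapest hypothesis is returned unchanged.
    """
    best_cost, best_word = min(
        (rec_cost + weighted_distance(text.lower(), word), word)
        for text, rec_cost in hypotheses
        for word in lexicon
    )
    if best_cost > oov_threshold:
        return min(hypotheses, key=lambda h: h[1])[0]
    return best_word


lexicon = ["hotel", "hostel", "motel"]
print(correct([("h0te1", 0.0), ("hote1", 0.3)], lexicon))
print(correct([("IEEE", 0.0)], lexicon))
```

In this sketch, lowercasing the hypothesis stands in for the capital-letter handling mentioned in the abstract, and the threshold is a crude way to leave acronyms and other OOV words untouched. A finite-state version would instead compose an acceptor for the OCR outputs with a confusion transducer and a lexicon acceptor, then take the shortest path.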
R. Beaufort, Céline Mancas-Thillou
Added 03 Jun 2010
Updated 03 Jun 2010
Type Conference
Year 2007
Where ICDAR
Authors R. Beaufort, Céline Mancas-Thillou