ICDAR
2003
IEEE

Fast Lexicon-Based Word Recognition in Noisy Index Card Images

This paper describes a complete system for reading typewritten lexicon words in noisy images, in this case museum index cards. The system is conceptually simple and straightforward to implement. It involves three stages of processing. The first stage extracts row regions from the image, where each row is a hypothesized line of text. The second stage scans an OCR classifier over each row image, building a character hypothesis graph in the process. The third stage searches this graph with a priority-queue-based algorithm for the best matches against a set of words (the lexicon). Performance evaluation on a set of museum archive cards indicates competitive accuracy and reasonable throughput. The priority-queue algorithm is over two hundred times faster than flat dynamic programming on these graphs.
Simon M. Lucas, Gregory Patoulas, Andy C. Downton
Type Conference
Year 2003
Where ICDAR