Top-down and bottom-up cues for scene text recognition



Scene text recognition has gained significant attention from the computer vision community in recent years. Recognizing such text is a challenging problem, even more so than the recognition of scanned documents. In this work, we focus on the problem of recognizing text extracted from street images. We present a framework that exploits both bottom-up and top-down cues. The bottom-up cues are derived from individual character detections from the image. We build a Conditional Random Field model on these detections to jointly model the strength of the detections and the interactions between them. We impose top-down cues obtained from a lexicon-based prior, i.e., language statistics, on the model. The optimal word represented by the text image is obtained by minimizing the energy function corresponding to the random field model. We show significant improvements in accuracies on two challenging public datasets, namely Street View Text (over 15%) and ICDAR 2003 (nearly 10%).
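The idea of combining detection strength with a lexicon prior in one energy function can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a simple left-to-right chain over character positions, uses made-up unary costs in place of real detector scores, derives pairwise costs from bigram counts in a tiny hypothetical lexicon, and minimizes the energy exactly with dynamic programming. All function names, the smoothing constant, and the sample data are illustrative assumptions.

```python
import math


def bigram_cost_from_lexicon(lexicon, smoothing=1e-3):
    """Build a pairwise cost -log P(a, b) from bigram counts (top-down cue).

    The additive smoothing keeps unseen bigrams finite but expensive.
    """
    counts = {}
    total = 0
    for entry in lexicon:
        for a, b in zip(entry, entry[1:]):
            counts[(a, b)] = counts.get((a, b), 0) + 1
            total += 1

    def cost(a, b):
        p = (counts.get((a, b), 0) + smoothing) / (total + smoothing)
        return -math.log(p)

    return cost


def minimize_chain_energy(unary, pairwise):
    """Exactly minimize E(x) = sum_i unary[i][x_i] + sum_i pairwise(x_i, x_{i+1}).

    unary: list of dicts mapping candidate label -> cost (bottom-up cue).
    pairwise: function (label, next_label) -> cost.
    Returns (best_label_string, best_energy).
    """
    # best[label] = (energy of the cheapest prefix ending in label, that prefix)
    best = {lab: (cost, [lab]) for lab, cost in unary[0].items()}
    for node in unary[1:]:
        nxt = {}
        for lab, cost in node.items():
            prev_lab, (prev_e, prev_path) = min(
                best.items(), key=lambda kv: kv[1][0] + pairwise(kv[0], lab)
            )
            nxt[lab] = (prev_e + pairwise(prev_lab, lab) + cost, prev_path + [lab])
        best = nxt
    energy, path = min(best.values(), key=lambda v: v[0])
    return "".join(path), energy


# Hypothetical detector output: lower cost means a stronger detection.
unary = [
    {"O": 0.4, "D": 0.3},
    {"P": 0.2, "F": 0.5},
    {"E": 0.3, "F": 0.25},
    {"N": 0.1, "H": 0.4},
]
lexicon = ["OPEN", "PEN", "DOOR"]  # hypothetical lexicon

# Bottom-up cues only (pairwise term switched off).
unary_only, _ = minimize_chain_energy(unary, lambda a, b: 0.0)
# Bottom-up cues plus the lexicon-based prior.
word, energy = minimize_chain_energy(unary, bigram_cost_from_lexicon(lexicon))
print(unary_only, word)
```

On this toy input the strongest individual detections spell a non-word, while adding the bigram prior pulls the solution toward a lexicon word. A chain with exact dynamic programming is used here only because it is short and verifiable; a model over real detections would also need to handle overlapping or spurious detections, which this sketch ignores.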

Anand Mishra, Karteek Alahari, C. V. Jawahar


Computer Vision | Computer Vision Community | CVPR 2012 | Public Datasets | Random Field Model


Post Info

Added	28 Sep 2012
Updated	28 Sep 2012
Type	Conference
Year	2012
Where	CVPR
Authors	Anand Mishra, Karteek Alahari, C. V. Jawahar
