Given a noisy text page, a word recognizer can generate a set of candidates for each word image. A relaxation algorithm was proposed previously by the authors that uses word collo...
Models of latent document semantics such as the mixture of multinomials model and Latent Dirichlet Allocation have received substantial attention for their ability to discover top...
Daniel David Walker, William B. Lund, Eric K. Ring...
Our proposed approach to text and line-art extraction requires accurately locating a text-string box and identifying external line vectors incident on the box. The results of extra...
Luyang Li, George Nagy, Ashok Samal, Sharad C. Set...
Current approaches to script identification rely on hand-selected features and often require processing a significant part of the document to achieve reliable identification. We p...
This paper gives an algorithm for detecting and reading text in natural images. The algorithm is intended for use by blind and visually impaired subjects walking through city scen...