In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing f...
We compare in this study two image restoration approaches for the pre-processing of printed documents: namely the Non-local Means filter and a total variation minimization approac...
The main problems of Optical Character Recognition (OCR) systems are solved if printed latin text is considered. Since OCR systems are based upon binary images, their results are ...
A number of projects are creating searchable digital libraries of printed books. These include the Million Book Project, the Google Book project and similar efforts from Yahoo an...
This paper introduces a multifont classification scheme to help recognition of multifont and multisize characters. It uses typographical attributes such as ascenders, descenders a...