Character recognition in a book should be formulated significantly differently from recognition of a single page or word. An ideal approach to designing such a recognizer is to adapt the classifier to the font and style of the collection. In this paper, we propose a learning-based adaptation framework for recognizing characters in a book. In the proposed system, a post-processor verifies the output of the recognition module; the verified output is then used for learning, which improves performance over successive iterations. Experiments are conducted on about 500,000 annotated symbols from five books in Malayalam (an Indian language). We achieve an average improvement of 14% in classification accuracy.
C. V. Jawahar, N. V. Neeba
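
The verify-then-relearn loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the nearest-centroid classifier, the dictionary lookup used as the post-processor, and all names (`train_centroids`, `classify`, `adapt`) are assumptions chosen only to keep the example small and self-contained.

```python
from collections import defaultdict


def train_centroids(samples):
    """Build one mean feature vector per label from (vector, label) pairs."""
    sums, counts = {}, defaultdict(int)
    for vec, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in sums[lab]] for lab in sums}


def classify(centroids, vec):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, vec))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))


def adapt(seed_samples, book_words, dictionary, iterations=3):
    """Adapt a classifier to one book (hypothetical stand-in for the framework).

    seed_samples: labelled (vector, label) pairs from a generic font.
    book_words:   unlabelled words, each a list of symbol feature vectors.
    dictionary:   set of valid words; plays the role of the post-processor.
    Labels are assumed to be single-character strings.
    """
    centroids = train_centroids(seed_samples)
    for _ in range(iterations):
        verified = []
        for word in book_words:
            labels = [classify(centroids, vec) for vec in word]
            # Post-processing step: keep only words the dictionary accepts.
            if "".join(labels) in dictionary:
                verified.extend(zip(word, labels))
        # Relearn from the seed data plus the verified book samples.
        centroids = train_centroids(list(seed_samples) + verified)
    return centroids
```

In this toy setting, symbols that the generic classifier gets wrong at first can be recovered in later iterations, once verified words from the same book have pulled the class centroids toward the book's font.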