The word accuracy of any optical character recognition (OCR) system is usually substantially below its component or character accuracy. This is especially true of Indic langua...
Venkat Rasagna, Anand Kumar 0002, C. V. Jawahar, R...
The aim of this work is to propose a new approach to the recognition of historical texts by providing an adaptive mechanism that automatically tunes itself to a specific book. Th...
Vladimir Kluzner, Asaf Tzadok, Yuval Shimony, Euge...
One essential issue in document clustering is estimating the appropriate number of clusters into which a document collection should be partitioned. In this paper, we ...
A technique is presented that uses visual relationships between word images in a document to improve the recognition of the text it contains. This technique takes advantage of the...
Named Entity (NE) recognition from the results of Automatic Speech Recognition (ASR) is challenging because of ASR errors. To detect NEs, one of the options is to use a statistica...