This paper describes a top-down word image generation model for holistic handwritten word recognition. To generate a word image, it uses likelihoods based, respectively, on a ling...
We describe a new corpus collected for comparative evaluation of OCR-software and postcorrection techniques. The corpus is freely available for academic groups and use. The major ...
Stoyan Mihov, Klaus U. Schulz, Christoph Ringlstet...
Despite ubiquitous claims that optical character recognition (OCR) is a "solved problem," many categories of documents continue to break modern OCR software such as docu...
—Retrieval from Hindi document image collections is a challenging task. This is partly due to the complexity of the script, which has more than 800 unique ligatures. In addition,...
Raman Jain, Volkmar Frinken, C. V. Jawahar, Raghav...
Line detection algorithms constitute the basis for technical document analysis and recognition. The performance of these algorithms decreases as the quality of the documents degra...