This paper describes a novel approach to named entity (NE) tagging on degraded documents. NE tagging is the process of identifying salient text strings in unstructured text, corre...
With large databases of document images available, a method for users to find keywords in documents will be useful. One approach is to perform Optical Character Recognition (OCR) ...
—Retrieval from Hindi document image collections is a challenging task. This is partly due to the complexity of the script, which has more than 800 unique ligatures. In addition,...
Raman Jain, Volkmar Frinken, C. V. Jawahar, Raghav...
It is a well documented fact that, for human readers, familiar text is more legible than unfamiliar text. Current-generation computer vision systems also are able to exploit some ...
Many document-based applications, including popular Web browsers, email viewers, and word processors, have a ‘Find on this Page’ feature that allows a user to find every occur...
Kevyn Collins-Thompson, Charles Schweizer, Susan T...