A vast number of historical and badly degraded document images can be found in libraries and in public and national archives. Due to the complex nature of different artifacts, such p...
Detection of curled textlines is important for dewarping hand-held camera-captured document images. Then baselines and the lines following the top of x-height of characters (x-l...
Syed Saqib Bukhari, Faisal Shafait, Thomas M. Breu...
This paper presents the XML-based formats ALTO, TEI, and METS used for Digital Libraries and their relevance to data representation in a Document Image Analysis and Recognition (DIAR)...
In this paper, we present a new approach to extracting the target text line from a document image captured by a pen scanner. Given the binary image, a set of possible text lines a...
We propose a method for constructing a vector for a document image to represent its content to facilitate text retrieval. The method is based on an N-Gram algorithm for text simil...