XML has become a popular method of data representation both on the web and in databases in recent years. One of the reasons for the popularity of XML has been its ability to encod...
Charu C. Aggarwal, Na Ta, Jianyong Wang, Jianhua F...
Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...
We report an automatic feature discovery method that achieves results comparable to a manually chosen, larger feature set on a document image content extraction problem: the locat...
In this paper, we present a novel graph-based method for extracting handwritten text lines in monochromatic Arabic document images. Our approach consists of two steps: Coarse text ...
Jayant Kumar, Wael Abd-Almageed, Le Kang, David S....
Over the past two decades, a significant number of layout analysis (page segmentation and region classification) approaches have been proposed in the literature. Each approach has b...