We present a system that classifies pixels in a document image according to marking type such as machine print, handwriting, and noise. A segmenter module first splits an input ...
Current approaches to script identification rely on hand-selected features and often require processing a significant part of the document to achieve reliable identification. We p...
Developing effective content recognition methods for diverse imagery continues to challenge computer vision researchers. We present a new approach for document image content catego...
Guangyu Zhu, Xiaodong Yu, Yi Li, David S. Doermann
A new approach for separating mathematics from usual text is presented. Contrary to the existing methods, it is more oriented toward the segmentation than the recognition, isolati...
—Recognition of document images having Greek polytonic (multi accent) characters is a challenging task due the large number of existing character classes (more than 270). In this...