This paper presents a new, enhanced algorithm for extracting text from degraded document images, based on probabilistic models. The observed document image is considered as a...
This paper proposes a novel dewarping technique for document images of bound volumes. This technique is a kind of model fitting technique for estimating the warp of each text li...
We present a document understanding system in which the arrangement of lines of text and block separators within a document is modeled by stochastic context-free grammars. A gram...
John C. Handley, Anoop M. Namboodiri, Richard Zani...
In this paper, we propose an accurate, suitably designed system for complex document segmentation. This system is based on the steerable pyramid transform. The features extracted ...
Revealing and being able to manipulate the structured content of PDF documents is a difficult task, requiring pre-processing and reverse engineering techniques. In this paper, we ...