A vast number of historical and badly degraded document images can be found in libraries and in public and national archives. Because of the complex nature of their various artifacts, such poor-quality documents are hard to read and to process. In this paper, a novel adaptive binarization algorithm using a ternary entropy-based approach is proposed. Given an input image, the intensity contrast is first estimated by a grayscale morphological closing operator. A double threshold is then generated by our Shannon entropy-based ternarizing method to classify pixels into text, near-text, and non-text regions. The pixels in the second (near-text) region are relabeled using the local mean and standard deviation. Our proposed method classifies noise into two categories, which are processed by binary morphological operators, shrink and swell filters, and a graph searching strategy. The method is tested with three databases that have been used in the Document Image Binarization Contest 2009 (DIBCO 2009), the Handwriting Docu...
T. Hoang Ngan Le, Tien D. Bui, Ching Y. Suen
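The ternary classification step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the double threshold is chosen here with a generic Kapur-style maximum-entropy criterion (the pair of thresholds that maximizes the summed Shannon entropy of the three classes), and the near-text relabeling uses a Niblack-like local rule. The function names, the `radius` and `k` parameters, and the polarity (high contrast means text, text darker than background) are all assumptions made for illustration.

```python
import math


def ternary_entropy_thresholds(hist):
    """Return (t1, t2), t1 < t2, maximising the summed Shannon entropy of the
    classes [0, t1), [t1, t2), [t2, n). Kapur-style stand-in, not the paper's rule."""
    n = len(hist)
    total = sum(hist)
    p = [h / total for h in hist]
    # Prefix sums of p and p*log(p) give each class entropy in O(1).
    cum_p = [0.0] * (n + 1)
    cum_plogp = [0.0] * (n + 1)
    for i, pi in enumerate(p):
        cum_p[i + 1] = cum_p[i] + pi
        cum_plogp[i + 1] = cum_plogp[i] + (pi * math.log(pi) if pi > 0 else 0.0)

    def class_entropy(a, b):
        w = cum_p[b] - cum_p[a]
        if w <= 0:
            return None  # empty class: skip this candidate
        # H = -sum (p_i / w) * log(p_i / w) = log(w) - (sum p_i log p_i) / w
        return math.log(w) - (cum_plogp[b] - cum_plogp[a]) / w

    best_score, best_t1, best_t2 = -math.inf, None, None
    for t1 in range(1, n - 1):
        h0 = class_entropy(0, t1)
        if h0 is None:
            continue
        for t2 in range(t1 + 1, n):
            h1 = class_entropy(t1, t2)
            h2 = class_entropy(t2, n)
            if h1 is None or h2 is None:
                continue
            score = h0 + h1 + h2
            if score > best_score:
                best_score, best_t1, best_t2 = score, t1, t2
    return best_t1, best_t2


def ternarize(contrast, t1, t2):
    """Label a 2-D contrast map: 2 = text, 1 = near-text, 0 = non-text.
    Assumes (hypothetically) that higher contrast indicates text."""
    return [[2 if v >= t2 else 1 if v >= t1 else 0 for v in row] for row in contrast]


def relabel_near_text(gray, labels, radius=1, k=0.2):
    """Resolve near-text pixels with a Niblack-like local rule (an assumption):
    a pixel becomes text if it is darker than local mean - k * local std."""
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in labels]
    for y in range(h):
        for x in range(w):
            if labels[y][x] != 1:
                continue
            window = [gray[j][i]
                      for j in range(max(0, y - radius), min(h, y + radius + 1))
                      for i in range(max(0, x - radius), min(w, x + radius + 1))]
            mean = sum(window) / len(window)
            std = math.sqrt(sum((v - mean) ** 2 for v in window) / len(window))
            out[y][x] = 2 if gray[y][x] < mean - k * std else 0
    return out
```

In this sketch `hist` would be the histogram of the contrast image, and `gray` the original grayscale image. The noise-removal stages mentioned in the abstract (morphological operators, shrink and swell filters, graph searching) are not covered.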