Improved Degraded Document Recognition with Hybrid Modeling Techniques and Character N-Grams


Download www.mmk.ei.tum.de

This paper describes a robust multifont character recognition system for degraded documents such as photocopies or faxes. The system is based on Hidden Markov Models (HMMs) using discrete and hybrid modeling techniques, where the latter make use of an information-theory-based neural network. The presented recognition results refer to the SEDAL database of English documents and were obtained without a dictionary. It is also demonstrated that using a language model consisting of character n-grams yields significantly better recognition results. The resulting system clearly outperforms commercial systems and leads to further error rate reductions compared to previous results reported on this database.
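To illustrate how a character n-gram language model can help without a dictionary, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not state the n-gram order, the smoothing method, or how the language model is combined with the HMM scores. The sketch assumes bigrams, add-one smoothing over the training alphabet, and simple rescoring of candidate strings; the function names, the `^` start marker, and the toy training text are all hypothetical.

```python
import math
from collections import Counter


def train_char_bigrams(text):
    """Count character contexts and bigrams in a training string."""
    padded = "^" + text  # "^" is a hypothetical start-of-text marker
    contexts = Counter(padded[:-1])  # how often each character precedes another
    bigrams = Counter(zip(padded[:-1], padded[1:]))
    alphabet = set(padded)
    return contexts, bigrams, alphabet


def log_prob(candidate, model):
    """Add-one smoothed bigram log-probability of a candidate string."""
    contexts, bigrams, alphabet = model
    padded = "^" + candidate
    total = 0.0
    for prev, cur in zip(padded[:-1], padded[1:]):
        numerator = bigrams[(prev, cur)] + 1
        denominator = contexts[prev] + len(alphabet)
        total += math.log(numerator / denominator)
    return total


# Toy training text (assumption); a real model would use a large corpus.
model = train_char_bigrams("the cat sat on the mat and the hat")

# Two hypothetical recognizer outputs for the same degraded word image.
candidates = ["the", "tbe"]
best = max(candidates, key=lambda c: log_prob(c, model))
```

Here the character sequence "th" followed by "he" is frequent in the training text, so "the" scores higher than the confusion "tbe". In a full recognizer such scores would typically be combined with the optical model's scores during decoding rather than applied afterwards.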

Anja Brakensiek, Daniel Willett, Gerhard Rigoll


Character N-grams Yields | Computer Vision | Error Rate Reductions | ICPR 2000 | Multifont Character Recognition |


Type	Conference
Year	2000
Where	ICPR
Authors	Anja Brakensiek, Daniel Willett, Gerhard Rigoll
