Greek Polytonic OCR Based on Efficient Character Class Number Reduction

Recognition of document images having Greek polytonic (multi-accent) characters is a challenging task due to the large number of existing character classes (more than 270). In this paper, we propose a novel OCR framework for the recognition of machine-printed Greek polytonic documents that is based on combining five different recognition modules in order to have a small number of classes (around 30) in each module. One recognition module is used for accent recognition, while four recognition modules are used for the recognition of characters belonging to different horizontal text zones. The proposed system also includes the following stages: a) preprocessing, b) text dewarping, text line and text baseline detection, c) accent and character detection and d) combination of accent and character recognition results. Extended experiments have been conducted in order to record the performance of the proposed OCR system, of all involved recognition modules as well as of the accent detection st...
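As a rough illustration of stage d), the sketch below merges a separately recognized base letter and accent labels into one polytonic character. It is not the authors' implementation: the label names, the `ACCENT_MARKS` table, and the `combine` helper are hypothetical, and Unicode NFC composition stands in for whatever combination rule the paper actually uses.

```python
import unicodedata
from typing import Iterable

# Hypothetical accent labels mapped to Unicode combining marks.
# The paper's real accent class inventory is not given in the abstract.
ACCENT_MARKS = {
    "psili": "\u0313",        # smooth breathing
    "dasia": "\u0314",        # rough breathing
    "oxia": "\u0301",         # acute
    "varia": "\u0300",        # grave
    "perispomeni": "\u0342",  # circumflex
}


def combine(base: str, accents: Iterable[str]) -> str:
    """Compose a base letter and accent labels into a single character.

    Assumes the character module and the accent module have already
    produced their labels independently.
    """
    marks = "".join(ACCENT_MARKS[name] for name in accents)
    # NFC turns the base letter plus combining marks into the precomposed
    # polytonic code point when one exists.
    return unicodedata.normalize("NFC", base + marks)


if __name__ == "__main__":
    # alpha + smooth breathing + acute
    print(combine("\u03b1", ["psili", "oxia"]))
```

Under this scheme each recognizer only has to separate its own small label set, and the many accented forms arise from combining the two outputs.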

Basilios Gatos, Georgios Louloudis, Nikolaos Stamatopoulos

Accent Characters | Detection Stage | Document Analysis | ICDAR 2011 | Polytonic

Added	24 Dec 2011
Updated	24 Dec 2011
Type	Journal
Year	2011
Where	ICDAR
Authors	Basilios Gatos, Georgios Louloudis, Nikolaos Stamatopoulos
