Sciweavers


ICDAR 2005, IEEE


Recognition of Printed Amharic Documents

Download cvit.iiit.ac.in

Several African languages have their own indigenous scripts. This paper presents an OCR system for the Amharic script. Amharic is the official and working language of Ethiopia. This is possibly the first attempt at developing an OCR system for Amharic. Recognition of Amharic script faces two major challenges: (i) more than 300 characters are used in writing, and (ii) a large set of these characters are visually similar. We propose a two-stage feature extraction scheme using PCA and LDA, followed by a decision DAG classifier with SVMs as the nodes. Recognition results are presented to demonstrate performance across printing variations (fonts, styles and sizes) and on real-life degraded documents such as books, magazines and newspapers.
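The decision DAG mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the feature vectors have already been reduced (the PCA and LDA stages are not shown), and it uses a hypothetical nearest-centroid rule, `centroid_decider`, as a stand-in for each trained pairwise SVM node. The class labels and centroid values are made up for illustration. The sketch shows only the elimination logic: each node compares the first and last remaining classes and discards the loser, so N classes need N-1 node evaluations.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
# A node returns True when the first class of its pair is preferred.
PairwiseDecider = Callable[[Vector], bool]


def centroid_decider(c_a: Vector, c_b: Vector) -> PairwiseDecider:
    """Hypothetical stand-in for a trained binary SVM node:
    prefer whichever class centroid is nearer to the input."""
    def decide(x: Vector) -> bool:
        d_a = sum((xi - ci) ** 2 for xi, ci in zip(x, c_a))
        d_b = sum((xi - ci) ** 2 for xi, ci in zip(x, c_b))
        return d_a <= d_b
    return decide


def ddag_predict(
    x: Vector,
    classes: List[str],
    nodes: Dict[Tuple[str, str], PairwiseDecider],
) -> str:
    """Walk the decision DAG: test first vs. last remaining class,
    eliminate the loser, repeat until one class is left."""
    remaining = list(classes)
    while len(remaining) > 1:
        first, last = remaining[0], remaining[-1]
        if nodes[(first, last)](x):
            remaining.pop()     # 'last' is eliminated
        else:
            remaining.pop(0)    # 'first' is eliminated
    return remaining[0]


# Illustrative data only: three made-up class labels with 2-D centroids.
centroids = {"ha": (0.0, 0.0), "le": (4.0, 0.0), "me": (0.0, 4.0)}
classes = ["ha", "le", "me"]

# One node per ordered pair (a, b) with a listed before b.
nodes = {
    (a, b): centroid_decider(centroids[a], centroids[b])
    for i, a in enumerate(classes)
    for b in classes[i + 1:]
}

print(ddag_predict((3.5, 0.5), classes, nodes))
```

In a real system each entry of `nodes` would wrap a binary SVM trained on the two classes of its pair. With more than 300 Amharic characters, the DAG evaluates only N-1 of the N(N-1)/2 pairwise classifiers per input.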

Million Meshesha, C. V. Jawahar


Amharic Script | Decision DAG Classifier | Document Analysis | ICDAR 2005 | Two-stage Feature Extraction


Related Content

» Lexicon-based offline recognition of Amharic words in unconstrained handwritten text

» Matching word images for content-based retrieval from printed document images

» HMM-Based Handwritten Amharic Word Recognition with Feature Concatenation

» Italic or Roman Word Style Recognition without A Priori Knowledge for Old Printed Document...

» Pre-Processing of Degraded Printed Documents by Nonlocal Means and Total Variation

» Character Enhancement for Historical Newspapers Printed Using Hot Metal Typesetting

» An Efficient Staff Removal Approach from Printed Musical Documents

» Printer Modeling for Document Imaging

» Structural Features for Recognizing Degraded Printed Gurmukhi Script

Post Info

Added	24 Jun 2010
Updated	24 Jun 2010
Type	Conference
Year	2005
Where	ICDAR
Authors	Million Meshesha, C. V. Jawahar
