
ICPR
2008
IEEE

Word-wise Sinhala Tamil and English script identification using Gaussian kernel SVM

Many documents in Sri Lanka contain Sinhala, Tamil and English text on a single page. When developing OCR for such a page, it is better to first identify the scripts present and then feed each identified portion to the corresponding OCR module. In this paper, an SVM-based technique is proposed for word-wise identification of Sinhala, Tamil and English scripts in a single document page. The technique relies mainly on structural features, topological features and features based on the water reservoir principle. In our experiments we obtained encouraging results.
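As an illustration of the pipeline described above, the following is a minimal sketch of word-wise script identification with a Gaussian kernel decision function, followed by grouping words for the respective OCR modules. It is not the authors' implementation. The two-dimensional feature vectors, the support vectors, coefficients and biases, the value of `gamma`, and the one-vs-rest scheme are all assumptions made for illustration; the abstract does not specify them, and the actual feature extraction (structural, topological and water reservoir features) is not reproduced here.

```python
import math


def gaussian_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel: K(x, y) = exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def decision_value(x, support_vectors, coeffs, bias, gamma=0.5):
    """SVM decision function: sum_i coeff_i * K(sv_i, x) + bias.

    Each coeff_i stands for alpha_i * y_i of a trained model.
    """
    total = sum(
        c * gaussian_kernel(sv, x, gamma)
        for sv, c in zip(support_vectors, coeffs)
    )
    return total + bias


# Hypothetical one-vs-rest models, one per script. The numbers are
# invented for illustration and are NOT taken from the paper.
MODELS = {
    "Sinhala": {"support_vectors": [(0.9, 0.1)], "coeffs": [1.0], "bias": -0.5},
    "Tamil": {"support_vectors": [(0.1, 0.9)], "coeffs": [1.0], "bias": -0.5},
    "English": {"support_vectors": [(0.1, 0.1)], "coeffs": [1.0], "bias": -0.5},
}


def identify_script(features, models=MODELS, gamma=0.5):
    """Return the script whose model gives the largest decision value."""
    return max(
        models,
        key=lambda s: decision_value(features, gamma=gamma, **models[s]),
    )


def route_words(word_features, models=MODELS):
    """Group word ids by identified script, ready for per-script OCR."""
    groups = {script: [] for script in models}
    for word_id, features in word_features.items():
        groups[identify_script(features, models)].append(word_id)
    return groups


# Hypothetical feature vectors for three words on one page.
page = {"w1": (0.85, 0.15), "w2": (0.1, 0.95), "w3": (0.12, 0.08)}
routed = route_words(page)
```

In a real system the per-script models would come from training an SVM on labelled word images, and each group in `routed` would be passed to the OCR module for that script.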
Sukalpa Chanda, Srikanta Pal, Umapada Pal
Added 30 May 2010
Updated 30 May 2010
Type Conference
Year 2008
Where ICPR