We describe a novel scheme for recognizing online handwritten basic characters of Bangla, an Indian script used by more than 200 million people. Bangla has 50 basic characters, and we use a database of 24,500 online handwritten isolated character samples written by 70 persons. Each sample in this database is composed of one or more strokes. We collected all the strokes from the training samples of the 50 character classes and manually grouped them into 54 classes based on the shape similarity of the graphemes that constitute the ideal character shapes. Strokes are recognized using hidden Markov models (HMMs), with one HMM constructed for each stroke class. A second classification stage then recognizes characters from the stroke classification results together with 50 look-up tables, one per character class.
Swapan K. Parui, Koushik Guin, Ujjwal Bhattacharya
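
The two-stage scheme summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the stroke features, the HMM topology, or the layout of the look-up tables. The sketch assumes that each stroke is a sequence of quantized pen directions (0 = rightward, 1 = downward), that each stroke class has a small discrete-observation HMM scored with the forward algorithm, and that each character's look-up table is a set of admissible stroke-class sequences. All names (`DiscreteHMM`, `classify_stroke`, `recognize_character`, `horizontal_bar`, `char_A`, and so on) and all parameter values are hypothetical.

```python
import math


def safe_log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0 else -math.inf


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-values."""
    m = max(values)
    if m == -math.inf:
        return -math.inf
    return m + math.log(sum(math.exp(v - m) for v in values))


class DiscreteHMM:
    """HMM over discrete symbols (an assumed feature type, not from the paper)."""

    def __init__(self, start, trans, emit):
        self.start = start  # start[i]    = P(first state is i)
        self.trans = trans  # trans[i][j] = P(next state j | current state i)
        self.emit = emit    # emit[i][k]  = P(symbol k | state i)

    def log_likelihood(self, observations):
        """Forward algorithm in log space: log P(observations | model)."""
        n = len(self.start)
        alpha = [
            safe_log(self.start[i]) + safe_log(self.emit[i][observations[0]])
            for i in range(n)
        ]
        for obs in observations[1:]:
            alpha = [
                log_sum_exp(
                    [alpha[i] + safe_log(self.trans[i][j]) for i in range(n)]
                )
                + safe_log(self.emit[j][obs])
                for j in range(n)
            ]
        return log_sum_exp(alpha)


def classify_stroke(models, observations):
    """Stage 1: pick the stroke class whose HMM scores the stroke highest."""
    return max(models, key=lambda name: models[name].log_likelihood(observations))


def recognize_character(models, lookup_tables, strokes):
    """Stage 2: match the sequence of stroke labels against each character's table."""
    labels = tuple(classify_stroke(models, s) for s in strokes)
    return [char for char, table in lookup_tables.items() if labels in table]


# Toy two-state left-to-right models; every number here is made up.
LEFT_TO_RIGHT = [[0.7, 0.3], [0.0, 1.0]]
STROKE_MODELS = {
    "horizontal_bar": DiscreteHMM([1.0, 0.0], LEFT_TO_RIGHT, [[0.9, 0.1], [0.9, 0.1]]),
    "vertical_bar": DiscreteHMM([1.0, 0.0], LEFT_TO_RIGHT, [[0.1, 0.9], [0.1, 0.9]]),
}

# One hypothetical look-up table per character: its admissible stroke sequences.
LOOKUP_TABLES = {
    "char_A": {("horizontal_bar", "vertical_bar")},
    "char_B": {("vertical_bar",)},
}

if __name__ == "__main__":
    sample = [[0, 0, 0, 1], [1, 1, 1]]  # two strokes: mostly rightward, then downward
    print(recognize_character(STROKE_MODELS, LOOKUP_TABLES, sample))  # ['char_A']
```

In this sketch the second stage returns every character whose table contains the observed stroke sequence, so an empty list signals a rejection and a list with several entries signals an ambiguity that would need a further tie-breaking rule.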