A New Method to Improve Multi Font Farsi/Arabic Character Segmentation Results: Using Extra Classes of Some Character Combinatio


A new segmentation algorithm for multifont Farsi/Arabic texts, based on conditional labeling of the up and down contours, was presented in [1]. A preprocessing technique adjusted the local base line for each subword. The adaptive base line, the up and down contours, and their curvatures were used to improve the segmentation results. That algorithm correctly segments 97% of 22,236 characters in 18 fonts. However, achieving high performance in the multifont case remains challenging because each font has different characteristics. Here we propose adding some extra classes in the recognition stage. The extra classes are parts of characters, or combinations of two or more characters, that cause most of the errors in the segmentation stage. These extra classes are determined statistically. We used a training document of 4,820 characters in 4 fonts. The segmentation rate improves from 96.7% to 99.64%.
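The abstract does not say how the extra classes are chosen beyond "statistically". As a minimal sketch of one plausible reading, the Python below counts which erroneous segments (character parts or merged character combinations) occur most often on a training document and keeps the most frequent ones until they cover a chosen share of all errors. The function name `select_extra_classes`, the `coverage` threshold, and the label strings are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def select_extra_classes(segment_labels, base_classes, coverage=0.9):
    """Pick extra recognition classes from segmentation errors.

    segment_labels: one label per segment produced on the training
        document, describing what the segment actually contains
        (hypothetical encoding: "a" for a whole character, "l+a" for
        two merged characters, "s:part" for a fragment of a character).
    base_classes: labels of the ordinary single-character classes.
    coverage: assumed stopping rule, the share of all errors that the
        selected extra classes should account for.
    """
    # Any segment that is not a whole single character is an error.
    errors = Counter(
        label for label in segment_labels if label not in base_classes
    )
    total = sum(errors.values())

    chosen, covered = [], 0
    # Walk error types from most to least frequent.
    for label, count in errors.most_common():
        if covered / total >= coverage:
            break
        chosen.append(label)
        covered += count
    return chosen


# Toy training data with placeholder labels (not real measurements).
labels = ["a"] * 90 + ["l+a"] * 6 + ["s:part"] * 3 + ["k+a"] * 1
extra = select_extra_classes(labels, base_classes={"a"}, coverage=0.8)
```

In this sketch, a recognizer trained with the selected labels as additional classes could accept a merged or partial segment as a valid unit instead of forcing it into a single-character class, which is the idea the abstract describes.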

Mona Omidyeganeh, Reza Azmi, Kambiz Nayebi, Abbas Javadtalab


Base Line | Extra Classes | MMM 2007 | Multifont Farsi/Arabic Texts | Multimedia


Post Info

Added	06 Jun 2010
Updated	06 Jun 2010
Type	Conference
Year	2007
Where	MMM
Authors	Mona Omidyeganeh, Reza Azmi, Kambiz Nayebi, Abbas Javadtalab
