EACL
2006
ACL Anthology

Unsupervised Discovery of Persian Morphemes

This paper reports current results of research on unsupervised Persian morpheme discovery. We present a method for discovering the morphemes of the Persian language through automatic analysis of corpora. We used a Minimum Description Length (MDL) based algorithm with several improvements and applied it to a Persian corpus. The improvements include enhancing the cost function with heuristics, preventing the splitting of high-frequency chunks, applying a penalty for first and last letters, and distinguishing pre-parts from post-parts. The improved approach raised the precision, recall and F-measure of discovery by 32%, 17% and 23%, respectively.
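To illustrate the general idea behind MDL-based segmentation, here is a minimal sketch of a generic two-part description length. It is not the authors' cost function, and it includes none of the heuristics listed in the abstract. The function name `description_length`, the uniform per-letter cost, the `alphabet_size` default of 32, and the transliterated toy words are all assumptions made for illustration.

```python
import math
from collections import Counter


def description_length(segmented_words, alphabet_size=32):
    """Generic two-part MDL cost in bits (illustrative, not the paper's).

    segmented_words: list of words, each given as a list of morph strings.
    alphabet_size: assumed number of letters, used for a uniform letter code.
    """
    tokens = [morph for word in segmented_words for morph in word]
    counts = Counter(tokens)
    total = len(tokens)
    bits_per_char = math.log2(alphabet_size)

    # Lexicon cost: spell out each distinct morph plus one end-of-morph symbol.
    lexicon_bits = sum((len(morph) + 1) * bits_per_char for morph in counts)

    # Corpus cost: each morph token costs -log2 of its relative frequency.
    corpus_bits = -sum(c * math.log2(c / total) for c in counts.values())

    return lexicon_bits + corpus_bits


# Hypothetical toy corpus (transliterated): three nouns with and without
# the plural suffix "ha", first unsegmented and then segmented.
unsplit = [["ketab"], ["ketabha"], ["medad"], ["medadha"],
           ["deraxt"], ["deraxtha"]]
split = [["ketab"], ["ketab", "ha"], ["medad"], ["medad", "ha"],
         ["deraxt"], ["deraxt", "ha"]]

if __name__ == "__main__":
    print("unsegmented:", round(description_length(unsplit), 2), "bits")
    print("segmented:  ", round(description_length(split), 2), "bits")
```

On this toy corpus the segmented analysis yields the shorter description, because the shared suffix is stored once in the lexicon instead of being spelled out inside every plural form. An MDL-based search prefers whichever segmentation minimizes this kind of total cost.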
Mohsen Arabsorkhi, Mehrnoush Shamsfard
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where EACL
Authors Mohsen Arabsorkhi, Mehrnoush Shamsfard