Unsupervised Discovery of Persian Morphemes

This paper reports current results of research on unsupervised Persian morpheme discovery. We present a method for discovering the morphemes of the Persian language through automatic analysis of corpora. We used a Minimum Description Length (MDL) based algorithm with some improvements and applied it to a Persian corpus. Our improvements include enhancing the cost function with heuristics, preventing the splitting of high-frequency chunks, applying a penalty for first and last letters, and distinguishing pre-parts from post-parts. The improved approach raised the precision, recall, and F-measure of discovery by 32%, 17%, and 23%, respectively.
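
To illustrate the general idea behind MDL-based segmentation, here is a minimal sketch of a generic two-part description-length cost. It is not the authors' cost function and includes none of their heuristics; the function name `description_length`, the letter-by-letter lexicon encoding, and the transliterated toy words are all assumptions made for illustration.

```python
import math
from collections import Counter

def description_length(segmented_corpus):
    """Generic two-part MDL cost in bits (hypothetical, not the paper's formula).

    segmented_corpus: list of words, each word given as a list of morph strings.
    """
    morphs = [m for word in segmented_corpus for m in word]
    counts = Counter(morphs)
    total = len(morphs)
    alphabet = {ch for m in counts for ch in m}

    # Lexicon cost: spell every distinct morph letter by letter,
    # plus one end-of-morph symbol per entry.
    bits_per_symbol = math.log2(len(alphabet) + 1)
    lexicon_cost = sum((len(m) + 1) * bits_per_symbol for m in counts)

    # Corpus cost: negative log-likelihood of the morph tokens
    # under maximum-likelihood unigram probabilities.
    corpus_cost = -sum(c * math.log2(c / total) for c in counts.values())

    return lexicon_cost + corpus_cost

# Toy data (assumed, transliterated): three nouns with and without the plural "-ha".
unsegmented = [["ketab"], ["ketabha"], ["derakht"], ["derakhtha"], ["gol"], ["golha"]]
segmented = [["ketab"], ["ketab", "ha"], ["derakht"], ["derakht", "ha"], ["gol"], ["gol", "ha"]]

cost_unsegmented = description_length(unsegmented)
cost_segmented = description_length(segmented)
print(cost_segmented < cost_unsegmented)
```

On this toy data the segmentation that factors out the shared suffix gives the shorter description, because the lexicon no longer has to spell each stem twice. An MDL search prefers such segmentations; the improvements listed in the abstract adjust this kind of cost with additional heuristics.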

Mohsen Arabsorkhi, Mehrnoush Shamsfard

EACL 2006 | Minimum Description Length | Natural Language Processing | Persian Language | Unsupervised Persian Morpheme |

» Unsupervised Discovery of Morphemes

» Unsupervised Multilingual Learning for Morphological Segmentation

» Unsupervised discovery of morphologically related words based on orthographic and semantic...

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	EACL
Authors	Mohsen Arabsorkhi, Mehrnoush Shamsfard
