NAACL
2007

High-Performance, Language-Independent Morphological Segmentation

This paper introduces an unsupervised morphological segmentation algorithm that shows robust performance for four languages with different levels of morphological complexity. In particular, our algorithm outperforms Goldsmith’s Linguistica and Creutz and Lagus’s Morfessor for English and Bengali, and achieves performance that is comparable to the best results for all three PASCAL evaluation datasets. Improvements arise from (1) the use of relative corpus frequency and suffix-level similarity for detecting incorrect morpheme attachments and (2) the induction of orthographic rules and allomorphs for segmenting words where roots exhibit spelling changes during morpheme attachments.
Sajib Dasgupta, Vincent Ng
Added: 30 Oct 2010
Updated: 30 Oct 2010
Type: Conference
Year: 2007
Where: NAACL
Authors: Sajib Dasgupta, Vincent Ng