EMNLP
2007

Unsupervised Part-of-Speech Acquisition for Resource-Scarce Languages

This paper proposes a new bootstrapping approach to unsupervised part-of-speech induction. In comparison to previous bootstrapping algorithms developed for this problem, our approach aims to improve the quality of the seed clusters by employing seed words that are both distributionally and morphologically reliable. In particular, we present a novel method for combining morphological and distributional information for seed selection. Experimental results demonstrate that our approach works well for English and Bengali, thus providing suggestive evidence that it is applicable to both morphologically impoverished languages and highly inflectional languages.
Sajib Dasgupta, Vincent Ng
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2007
Where EMNLP
Authors Sajib Dasgupta, Vincent Ng