EMNLP
2008

Unsupervised Multilingual Learning for POS Tagging

We demonstrate the effectiveness of multilingual learning for unsupervised part-of-speech tagging. The key hypothesis of multilingual learning is that by combining cues from multiple languages, the structure of each becomes more apparent. We formulate a hierarchical Bayesian model for jointly predicting bilingual streams of part-of-speech tags. The model learns language-specific features while capturing cross-lingual patterns in tag distribution for aligned words. Once the parameters of our model have been learned on bilingual parallel data, we evaluate its performance on a held-out monolingual test set. Our evaluation on six pairs of languages shows consistent and significant performance gains over a state-of-the-art monolingual baseline. For one language pair, we observe a relative reduction in error of 53%.
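A relative reduction in error is conventionally the share of the baseline's errors that the new model removes. The snippet below is a minimal sketch of that arithmetic, assuming error is defined as one minus tagging accuracy. The function name `relative_error_reduction` and the accuracy values are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def relative_error_reduction(baseline_accuracy: float, model_accuracy: float) -> float:
    """Return the fraction of the baseline's errors that the model eliminates.

    Assumes error rate = 1 - accuracy, with both accuracies in [0, 1].
    """
    baseline_error = 1.0 - baseline_accuracy
    model_error = 1.0 - model_accuracy
    if baseline_error <= 0.0:
        raise ValueError("baseline has no errors to reduce")
    return (baseline_error - model_error) / baseline_error


# Hypothetical accuracies for illustration only (not taken from the paper).
reduction = relative_error_reduction(baseline_accuracy=0.80, model_accuracy=0.90)
print(f"{reduction:.0%}")
```

Under this definition, a model that halves the baseline's error rate scores 0.5 no matter how high the absolute accuracies are, which is why relative reductions are often reported alongside raw accuracy.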
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where EMNLP
Authors Benjamin Snyder, Tahira Naseem, Jacob Eisenstein, Regina Barzilay