ACL
2006

Contextual Dependencies in Unsupervised Word Segmentation

Developing better methods for segmenting continuous text into words is important for improving the processing of Asian languages, and may shed light on how humans learn to segment speech. We propose two new Bayesian word segmentation methods that assume unigram and bigram models of word dependencies respectively. The bigram model greatly outperforms the unigram model (and previous probabilistic models), demonstrating the importance of such dependencies for word segmentation. We also show that previous probabilistic models rely crucially on suboptimal search procedures.
Sharon Goldwater, Thomas L. Griffiths, Mark Johnson
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Where ACL
Authors Sharon Goldwater, Thomas L. Griffiths, Mark Johnson