In this paper, we propose a novel Chinese word segmentation method which leverages the huge deposit of Web documents and search technology. It simultaneously solves ambiguous phra...
Word segmentation is the first and obligatory task for every NLP. For inflectional languages like English, French, Dutch,.. their word boundaries are simply assumed to be whitespa...
There are two main topics in this paper: (i) Vietnamese words are recognized and sentences are segmented into words by using probabilistic models; (ii) the optimum probabilistic mo...
Segmentation of handwritten text in Gurmukhi script is an uphill task primarily because of the structural features of the script and varied writing styles. The presence of a horiz...
We present a stochastic finite-state model for segmenting Chinese text into dictionary entries and productively derived words, and providing pronunciations for these words; the me...
Richard Sproat, Chilin Shih, William Gale, Nancy C...