We present a stochastic finite-state model for segmenting Chinese text into dictionary entries and productively derived words, and providing pronunciations for these words; the me...
Richard Sproat, Chilin Shih, William Gale, Nancy C...
This paper presents a discriminative pruning method of n-gram language model for Chinese word segmentation. To reduce the size of the language model that is used in a Chinese word...
This report describes the English-Chinese crosslanguage retrieval experiments at Berkeley for TREC-9 Cross-Language Information Retrieval track. We present a simple and effective ...
Adaptor grammars are a framework for expressing and performing inference over a variety of non-parametric linguistic models. These models currently provide state-of-the-art perfor...
In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance...