In this paper, we present a hybrid method for word segmentation and POS tagging. The target languages are those in which word boundaries are ambiguous, such as Chinese and Japanes...
The dominant practice of statistical machine translation (SMT) uses the same Chinese word segmentation specification in both alignment and translation rule induction steps in buil...
Ning Xi, Guangchao Tang, Xinyu Dai, Shujian Huang,...
This paper describes a set of computer programs for Chinese corpus analysis. These programs include (1) extraction of different characters, bigrams and words; (2) word segmentatio...
We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. In this paper the distribution and types of new words are discussed emp...