Chinese input is one of the key challenges for Chinese PC users. This paper proposes a statistical approach to Pinyin-based Chinese input. This approach uses a trigram-based langu...
We proposed a subword-based tagging for Chinese word segmentation to improve the existing character-based tagging. The subword-based tagging was implemented using the maximum entr...
Research on linear text segmentation has been an on-going focus in NLP for the last decade, and it has great potential for a wide range of applications such as document summarizati...
Jingbo Zhu, Na Ye, Xinzhi Chang, Wenliang Chen, Be...
Word segmentation is the most critical pre-processing step for any handwritten document recognition/retrieval system. This paper describes an approach to separate a line of uncons...
This paper presents a Chinese word segmentation system that uses improved sourcechannel models of Chinese sentence generation. Chinese words are defined as one of the following fo...