Words in Chinese text are not naturally separated by delimiters, which poses a challenge to standard machine translation (MT) systems. In MT, the widely used approach is to apply ...
Jia Xu, Jianfeng Gao, Kristina Toutanova, Hermann ...
Abstract. Term extraction is an important problem in natural language processing. In this paper, we propose a language independent statistical corpus-based term extraction algorith...
We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. In this paper the distribution and types of new words are discussed emp...
We propose new methods to exploit contemporaneous text, such as on-line news articles, to improve language models for automatic speech recognition and other natural language proce...