The dominant practice of statistical machine translation (SMT) uses the same Chinese word segmentation specification in both alignment and translation rule induction steps in buil...
Ning Xi, Guangchao Tang, Xinyu Dai, Shujian Huang,...
Abstract—According to characteristics of Mongolian wordformation, a method for removing inflectional suffixes from word images of the Mongolian Kanjur is proposed in this paper. ...
This paper presents an Enhanced Heuristic Segmenter (EHS) and an improved neural-based segmentation technique for segmenting cursive words and validating prospective segmentation ...
We present an unsupervised word segmentation model for machine translation. The model uses existing monolingual segmentation techniques and models the joint distribution over sour...
Almost all Chinese language processing tasks involve word segmentation of the language input as their first steps, thus robust and reliable segmentation techniques are always requ...