In this paper, we propose a novel Chinese word segmentation method which leverages the huge deposit of Web documents and search technology. It simultaneously solves ambiguous phra...
We proposed a subword-based tagging for Chinese word segmentation to improve the existing character-based tagging. The subword-based tagging was implemented using the maximum entr...
We propose the first joint model for word segmentation, POS tagging, and dependency parsing for Chinese. Based on an extension of the incremental joint model for POS tagging and ...
Jun Hatori, Takuya Matsuzaki, Yusuke Miyao, Jun-ic...
This paper describes a hybrid model that combines machine learning with linguistic heuristics for integrating unknown word identification with Chinese word segmentation. The model...
Abstract. We propose a new unsupervised training method for acquiring probability models that accurately segment Chinese character sequences into words. By constructing a core lexi...