Sciweavers

COLING
2010
13 years 7 months ago
Word-based and Character-based Word Segmentation Models: Comparison and Combination
We present a theoretical and empirical comparative analysis of the two dominant categories of approaches in Chinese word segmentation: word-based models and character-based models...
Weiwei Sun
ACL
2009
13 years 10 months ago
An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagging
In this paper, we present a discriminative word-character hybrid model for joint Chinese word segmentation and POS tagging. Our word-character hybrid model offers high performance...
Canasai Kruengkrai, Kiyotaka Uchimoto, Jun'ichi Ka...
ACL
2009
13 years 10 months ago
Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS Tagging - A Case Study
Manually annotated corpora are valuable but scarce resources, yet for many annotation tasks such as treebanking and sequence labeling there exist multiple corpora with different a...
Wenbin Jiang, Liang Huang, Qun Liu
ACL
2003
14 years 2 months ago
Improved Source-Channel Models for Chinese Word Segmentation
This paper presents a Chinese word segmentation system that uses improved source-channel models of Chinese sentence generation. Chinese words are defined as one of the following fo...
Jianfeng Gao, Mu Li, Changning Huang
ACL
2006
14 years 2 months ago
Subword-Based Tagging for Confidence-Dependent Chinese Word Segmentation
We proposed a subword-based tagging for Chinese word segmentation to improve the existing character-based tagging. The subword-based tagging was implemented using the maximum entr...
Ruiqiang Zhang, Gen-ichiro Kikui, Eiichiro Sumita
ACL
2006
14 years 2 months ago
Discriminative Pruning of Language Models for Chinese Word Segmentation
This paper presents a discriminative pruning method of n-gram language model for Chinese word segmentation. To reduce the size of the language model that is used in a Chinese word...
Jianfeng Li, Haifeng Wang, Dengjun Ren, Guohua Li
ACL
2004
14 years 2 months ago
Adaptive Chinese Word Segmentation
This paper presents a Chinese word segmentation system which can adapt to different domains and standards. We first present a statistical framework where domain-specific words are...
Jianfeng Gao, Andi Wu, Cheng-Ning Huang, Hongqiao ...
LREC
2010
14 years 2 months ago
Adapting Chinese Word Segmentation for Machine Translation Based on Short Units
In Chinese texts, words composed of single or multiple characters are not separated by spaces, unlike most western languages. Therefore Chinese word segmentation is considered an ...
Yiou Wang, Kiyotaka Uchimoto, Jun'ichi Kazama, Can...
LREC
2010
14 years 2 months ago
How Large a Corpus Do We Need: Statistical Method Versus Rule-based Method
We investigate the impact of input data scale in corpus-based learning using a study style of Zipf's law. In our research, Chinese word segmentation is chosen as the study ca...
Hai Zhao, Yan Song, Chunyu Kit
ACL
2007
14 years 2 months ago
Chinese Segmentation with a Word-Based Perceptron Algorithm
Standard approaches to Chinese word segmentation treat the problem as a tagging task, assigning labels to the characters in the sequence indicating whether the character marks a w...
Yue Zhang, Stephen Clark