Sciweavers

31 search results - page 3 / 7
» Enhancing Chinese Word Segmentation Using Unlabeled Data
Sort
View
IJCNLP
2004
Springer
14 years 21 days ago
The Use of SVM for Chinese New Word Identification
We present a study of new word identification (NWI) to improve the performance of a Chinese word segmenter. In this paper the distribution and types of new words are discussed emp...
Hongqiao Li, Changning Huang, Jianfeng Gao, Xiaozh...
ACL
2006
13 years 8 months ago
Subword-Based Tagging for Confidence-Dependent Chinese Word Segmentation
We proposed a subword-based tagging for Chinese word segmentation to improve the existing character-based tagging. The subword-based tagging was implemented using the maximum entr...
Ruiqiang Zhang, Gen-ichiro Kikui, Eiichiro Sumita
ACL
2006
13 years 8 months ago
Unsupervised Segmentation of Chinese Text by Use of Branching Entropy
We propose an unsupervised segmentation method based on an assumption about language data: that the increasing point of entropy of successive characters is the location of a word ...
Zhihui Jin, Kumiko Tanaka-Ishii
NIPS
2008
13 years 8 months ago
Learning the Semantic Correlation: An Alternative Way to Gain from Unlabeled Text
In this paper, we address the question of what kind of knowledge is generally transferable from unlabeled text. We suggest and analyze the semantic correlation of words as a gener...
Yi Zhang 0010, Jeff Schneider, Artur Dubrawski
NLPRS
2001
Springer
13 years 11 months ago
Automatic Corpus-Based Extraction of Chinese Legal Terms
This paper reports on a study involving the automatic extraction of Chinese legal terms. We used a word segmented corpus of Chinese court judgments to extract salient legal expres...
Oi Yee Kwong, Benjamin K. Tsou