ACL
2006


Subword-Based Tagging for Confidence-Dependent Chinese Word Segmentation


We proposed subword-based tagging for Chinese word segmentation as an improvement over the existing character-based tagging. The subword-based tagging was implemented using the maximum entropy (MaxEnt) and conditional random fields (CRF) methods. We found that the proposed subword-based tagging outperformed character-based tagging in all comparative experiments. In addition, we proposed a confidence measure approach to combine the results of a dictionary-based segmentation and a subword-tagging-based segmentation. This approach can produce an ideal tradeoff between the in-vocabulary rate and the out-of-vocabulary rate. Our techniques were evaluated using the test data from the SIGHAN Bakeoff 2005. We achieved higher F-scores than the best results in three of the four corpora: PKU (0.951), CITYU (0.950) and MSR (0.971).
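
To make the idea of combining two segmenters by confidence more concrete, here is a minimal sketch in Python. It is not the authors' confidence measure, which the abstract does not specify. The B/I/O tag meanings, the per-unit threshold rule, the function names `tags_to_words` and `combine`, and the example tags and scores are all assumptions made for illustration.

```python
from typing import List


def tags_to_words(units: List[str], tags: List[str]) -> List[str]:
    """Join tagged units (characters or subwords) into words.

    Assumed tag scheme: "B" starts a word, "I" continues the
    previous word, "O" is a unit that forms a word on its own.
    """
    words: List[str] = []
    for unit, tag in zip(units, tags):
        if tag == "I" and words:
            words[-1] += unit      # extend the word in progress
        else:
            words.append(unit)     # start a new word
    return words


def combine(units: List[str],
            dict_tags: List[str],
            tagger_tags: List[str],
            tagger_conf: List[float],
            threshold: float) -> List[str]:
    """Hypothetical combination rule: keep the tagger's tag when its
    confidence reaches the threshold, else use the dictionary's tag."""
    chosen = [
        t if c >= threshold else d
        for d, t, c in zip(dict_tags, tagger_tags, tagger_conf)
    ]
    return tags_to_words(units, chosen)


# Subword units for an ambiguous string (illustrative values only).
units = ["北京", "大", "学", "生"]
dict_tags = ["B", "I", "I", "O"]      # dictionary-based segmenter
tagger_tags = ["O", "B", "I", "I"]    # subword tagger
tagger_conf = [0.9, 0.8, 0.9, 0.7]    # made-up confidence scores

low = combine(units, dict_tags, tagger_tags, tagger_conf, 0.5)
high = combine(units, dict_tags, tagger_tags, tagger_conf, 1.0)
```

With the low threshold every tag comes from the tagger, and with the high threshold every tag comes from the dictionary. Moving the threshold between those extremes is one way such a scheme could trade in-vocabulary accuracy against out-of-vocabulary accuracy.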

Ruiqiang Zhang, Gen-ichiro Kikui, Eiichiro Sumita

ACL 2006 | ACL 2007 | Character-based Tagging | Chinese Word Segmentation | Subword-based Tagging

Related Content

» An Error-Driven Word-Character Hybrid Model for Joint Chinese Word Segmentation and POS Tagg...

» A Stacked Sub-Word Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

» Automatic Adaptation of Annotation Standards: Chinese Word Segmentation and POS Tagging A ...

» Incremental Joint Approach to Word Segmentation, POS Tagging, and Dependency Parsing in Chin...

» Chinese Part-of-Speech Tagging: One-at-a-Time or All-at-Once? Word-Based or Character-Based?

» Word Lattice Reranking for Chinese Word Segmentation and Part-of-Speech Tagging

» A Lexicon-Constrained Character Model for Chinese Morphological Analysis

» Combining Machine Learning with Linguistic Heuristics for Chinese Word Segmentation

» A Cascaded Linear Model for Joint Chinese Word Segmentation and Part-of-Speech Tagging

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2006
Where	ACL
Authors	Ruiqiang Zhang, Gen-ichiro Kikui, Eiichiro Sumita