

A Fast Decoder for Joint Word Segmentation and POS-Tagging Using a Single Discriminative Model

We show that the standard beam-search algorithm can be used as an efficient decoder for the global linear model of Zhang and Clark (2008) for joint word segmentation and POS-tagging, achieving a significant speed improvement. Such decoding is enabled by: (1) separating full-word features from partial-word features so that feature templates can be instantiated incrementally, according to whether the current character is separated or appended; (2) deciding the POS-tag of a potential word when its first character is processed. Early update is used with perceptron training so that the linear model gives a high score to a correct partial candidate as well as to a full output. Effective scoring of partial structures allows the decoder to give high accuracy with a small beam size of 16. In our 10-fold cross-validation experiments with the Chinese Treebank, our system ran over 10 times as fast as that of Zhang and Clark (2008) with little accuracy loss. The accuracy of our system on the standard CT...
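The decoding scheme in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the tagset `TAGS`, the feature names (`word_tag`, `tag_bigram`, `first_char_tag`, `char_tag`) and the helper functions are hypothetical stand-ins chosen for brevity, and the weights are a plain dictionary rather than a trained perceptron. It only shows the two actions (separate, append), the tag being fixed when a word's first character is read, and full-word features being scored once a word is closed. Early-update training is not shown.

```python
TAGS = ("N", "V")  # toy tagset (assumption, not the treebank tagset)


def full_word_feats(words):
    """Features that need a completed word; scored once, when the word is closed."""
    word, tag = words[-1]
    feats = [("word_tag", word, tag)]
    if len(words) > 1:
        feats.append(("tag_bigram", words[-2][1], tag))
    return feats


def partial_word_feats(char, tag, is_first):
    """Features that can be scored while the current word is still growing."""
    return [("first_char_tag" if is_first else "char_tag", char, tag)]


def score(weights, feats):
    """Linear model: sum of the weights of the active features."""
    return sum(weights.get(f, 0.0) for f in feats)


def decode(chars, weights, beam_size=16):
    """Beam search over characters; the last word of each candidate may be partial."""
    beam = [(0.0, ())]  # (score, tuple of (word, tag) pairs)
    for ch in chars:
        candidates = []
        for s, words in beam:
            # Separate: close the previous word, then open a new word whose
            # tag is decided now, at its first character.
            closed = score(weights, full_word_feats(words)) if words else 0.0
            for tag in TAGS:
                opened = score(weights, partial_word_feats(ch, tag, True))
                candidates.append((s + closed + opened, words + ((ch, tag),)))
            # Append: extend the current word; its tag does not change.
            if words:
                word, tag = words[-1]
                grown = score(weights, partial_word_feats(ch, tag, False))
                candidates.append((s + grown, words[:-1] + ((word + ch, tag),)))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]
    # Close the final word of every surviving candidate before picking the best.
    finished = [(s + score(weights, full_word_feats(w)), w) for s, w in beam if w]
    if not finished:
        return []
    return list(max(finished, key=lambda c: c[0])[1])


toy_weights = {
    ("word_tag", "ab", "N"): 2.0,
    ("word_tag", "c", "V"): 1.0,
    ("tag_bigram", "N", "V"): 0.5,
}
result = decode("abc", toy_weights, beam_size=16)
```

With these hand-set toy weights, `result` holds the segmentation and tags that the sketch scores highest for the three-character input. Because partial-word features are scored as soon as a character is read, candidates of equal length are comparable at every step, which is what lets a small beam work.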
Yue Zhang 0004, Stephen Clark
Added 11 Feb 2011
Updated 11 Feb 2011
Type Journal
Year 2010