

A Hybrid Markov/Semi-Markov Conditional Random Field for Sequence Segmentation

Markov order-1 conditional random fields (CRFs) and semi-Markov CRFs are two popular models for sequence segmentation and labeling. Both models have advantages in terms of the type of features they most naturally represent. We propose a hybrid model that is capable of representing both types of features, and describe efficient algorithms for its training and inference. We demonstrate that our hybrid model achieves error reductions of 18% and 25% over a standard order-1 CRF and a semi-Markov CRF, respectively, on the task of Chinese word segmentation. We also propose the use of a powerful feature for the semi-Markov CRF: the log conditional odds that a given token sequence constitutes a chunk according to a generative model, which reduces error by an additional 13%. Our best system achieves 96.8% F-measure, the highest reported score on this test set.
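
The abstract does not give the algorithms themselves. As a rough illustration of what segment-level (semi-Markov) scoring means for word segmentation, the following is a minimal sketch of generic dynamic-programming decoding over whole segments. It is not the hybrid model's inference procedure, and the names `segment_viterbi`, `toy_score`, and `LEXICON`, along with all score values, are hypothetical and chosen only for this example.

```python
from typing import Callable, List, Tuple

def segment_viterbi(
    chars: str,
    score: Callable[[str], float],
    max_len: int,
) -> Tuple[List[str], float]:
    """Return the highest-scoring segmentation of `chars`.

    Each candidate segment is scored as a whole unit, which is the
    defining property of semi-Markov models. Segments are limited to
    `max_len` characters to keep the search tractable.
    """
    n = len(chars)
    best = [float("-inf")] * (n + 1)  # best[i]: best score for chars[:i]
    back = [0] * (n + 1)              # back[i]: start of the last segment
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + score(chars[start:end])
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Follow the back-pointers from the end to recover the segments.
    segments: List[str] = []
    end = n
    while end > 0:
        start = back[end]
        segments.append(chars[start:end])
        end = start
    segments.reverse()
    return segments, best[n]

# Hypothetical segment scores standing in for a trained model.
LEXICON = {"ab": 2.0, "cd": 2.0, "abc": 2.5}

def toy_score(segment: str) -> float:
    """Score known segments from LEXICON; penalize unknown ones."""
    if segment in LEXICON:
        return LEXICON[segment]
    return -1.0 if len(segment) == 1 else -10.0 * len(segment)

segments, total = segment_viterbi("abcd", toy_score, max_len=3)
print(segments, total)
```

In a trained semi-Markov CRF, the role of `toy_score` would be played by a weighted sum of segment features, which is where a feature such as the log conditional odds described above could enter.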
Galen Andrew
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2006
Authors Galen Andrew