

Chunking Using Conditional Random Fields in Korean Texts

We present a method of chunking in Korean texts using conditional random fields (CRFs), a recently introduced probabilistic model for labeling and segmenting sequence data. In agglutinative languages such as Korean and Japanese, rule-based chunking methods are predominantly used for their simplicity and efficiency. A hybrid of rule-based and machine learning methods has also been proposed to handle exceptional cases of the rules. In this paper, we show how CRFs can be applied to the task of chunking in Korean texts. Experiments using the STEP 2000 dataset show that the proposed method significantly improves performance and outperforms previous systems.
Yong-Hun Lee, Mi-Young Kim, Jong-Hyeok Lee
Added 27 Jun 2010
Updated 27 Jun 2010
Type Conference
Year 2005