Japanese Named Entity Extraction with Redundant Morphological Analysis

Named Entity (NE) extraction is an important subtask of document processing applications such as information extraction and question answering. A typical method for NE extraction from Japanese text is a cascade of morphological analysis, POS tagging and chunking. However, in some cases the segmentation granularity of the morphological analysis conflicts with the building units of NEs, so that extraction of some NEs is inherently impossible in this setting. To cope with this unit problem, we propose a character-based chunking method. First, the input sentence is analyzed redundantly by a statistical morphological analyzer to produce multiple (n-best) answers. Then, each character is annotated with its character type and with the possible POS tags drawn from the top n-best answers. Finally, a support vector machine-based chunker selects portions of the input sentence as NEs. This method introduces richer information to the chunker than previous methods that are based on a single morphol...
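The character annotation step described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the character classes, the B/I/E/S word-position prefixes, the POS labels and the two sample segmentations are all hypothetical, and the n-best list is written by hand rather than produced by a real morphological analyzer.

```python
def char_type(ch):
    """Return a coarse character class (an illustrative inventory, assumed here)."""
    code = ord(ch)
    if 0x3040 <= code <= 0x309F:
        return "HIRAGANA"
    if 0x30A0 <= code <= 0x30FF:
        return "KATAKANA"
    if 0x4E00 <= code <= 0x9FFF:
        return "KANJI"
    if ch.isdigit():
        return "DIGIT"
    if ch.isalpha():
        return "ALPHA"
    return "OTHER"


def position_tags(length):
    """Position of each character inside a word: S(ingle), B(egin), I(nside), E(nd)."""
    if length == 1:
        return ["S"]
    return ["B"] + ["I"] * (length - 2) + ["E"]


def annotate(sentence, nbest):
    """Attach a character type and one position-POS tag per analysis to each character.

    `nbest` is a list of analyses; each analysis is a list of (word, pos) pairs
    whose words concatenate to `sentence`.
    """
    features = [{"char": ch, "type": char_type(ch), "pos": []} for ch in sentence]
    for analysis in nbest:
        if "".join(word for word, _ in analysis) != sentence:
            raise ValueError("analysis does not cover the sentence")
        i = 0
        for word, pos in analysis:
            for tag in position_tags(len(word)):
                features[i]["pos"].append(f"{tag}-{pos}")
                i += 1
    return features


# Hypothetical 2-best output for one sentence (hand-written for illustration).
sentence = "東京都に行く"
nbest = [
    [("東京都", "NOUN"), ("に", "PARTICLE"), ("行く", "VERB")],
    [("東京", "NOUN"), ("都", "SUFFIX"), ("に", "PARTICLE"), ("行く", "VERB")],
]

for feat in annotate(sentence, nbest):
    print(feat["char"], feat["type"], feat["pos"])
```

Because every character carries tags from each analysis, a chunker working on these features can still see the boundary between 東京 and 都 even when the top-ranked analysis treats 東京都 as one word. The per-character feature dictionaries would then be passed to a chunker such as an SVM, which is not shown here.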

Masayuki Asahara, Yuji Matsumoto

Input Sentence | Morphological Analysis | NAACL 2003 | NAACL 2007 | NE Extraction |

Added	31 Oct 2010
Updated	31 Oct 2010
Type	Conference
Year	2003
Where	NAACL
Authors	Masayuki Asahara, Yuji Matsumoto
