Analysis of Japanese Compound Nouns by Direct Text Scanning



This paper aims to analyze word dependency structure in compound nouns appearing in Japanese newspaper articles. The analysis is difficult because such compound nouns can be quite long, have no word boundaries between the nouns they contain, and often contain unregistered words such as abbreviations. The non-segmentation property and unregistered words cause initial segmentation errors, which result in erroneous analysis. This paper presents a corpus-based approach which scans a corpus with a set of pattern matchers and gathers co-occurrence examples to analyze compound nouns. It employs a bootstrapping search to cope with unregistered words: if an unregistered word is found in the process of searching the examples, it is recorded and invokes additional searches to gather the examples containing it. This makes it possible to correct initial oversegmentation errors, and leads to higher accuracy. The accuracy of the method is evaluated using the compound nouns of length 5, 6, 7, a...
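
The bootstrapping search described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the single "A の B" pattern, the kanji-boundary check, the toy corpus and lexicon, and the names `substrings`, `find_pair`, and `bootstrap_search` are all assumptions made for illustration. The idea shown is only the worklist: each registered word is searched for in the raw text, and any unregistered substring of the compound that turns up as its partner is recorded and queued for its own search.

```python
import re
from collections import deque

KANJI = r"[\u4e00-\u9fff]"  # assumption: candidate words are runs of kanji
PARTICLE = "の"             # assumption: one "A の B" pattern stands in for the matcher set


def substrings(text, min_len=2):
    """All substrings of `text` with at least `min_len` characters."""
    return {text[i:j] for i in range(len(text))
            for j in range(i + min_len, len(text) + 1)}


def find_pair(left, right, corpus):
    """True if `left の right` occurs in the corpus as a complete kanji run on both sides."""
    pattern = re.compile(
        f"(?<!{KANJI}){re.escape(left)}{PARTICLE}{re.escape(right)}(?!{KANJI})")
    return any(pattern.search(sentence) for sentence in corpus)


def bootstrap_search(compound, corpus, lexicon):
    """Gather co-occurrence examples, queueing newly found unregistered words."""
    candidates = substrings(compound)
    queue = deque(sorted(w for w in candidates if w in lexicon))
    searched, unregistered, examples = set(), set(), set()
    while queue:
        word = queue.popleft()
        if word in searched:
            continue
        searched.add(word)
        for other in sorted(candidates):
            for left, right in ((word, other), (other, word)):
                if not find_pair(left, right, corpus):
                    continue
                examples.add((left, right))
                # A partner missing from the lexicon is treated as an
                # unregistered word and triggers an additional search.
                if other not in lexicon and other not in unregistered:
                    unregistered.add(other)
                    queue.append(other)
    return examples, unregistered


# Hypothetical toy data: 安保理 is an abbreviation absent from the lexicon.
corpus = ["国連の安保理が開かれた", "安保理の決議を採択した"]
lexicon = {"国連", "安保", "決議"}
examples, unregistered = bootstrap_search("国連安保理決議", corpus, lexicon)
print(examples, unregistered)
```

On this toy input a lexicon-only segmentation would split 安保理 into 安保 and 理; the search instead records 安保理 as an unregistered word because it appears as a whole unit next to 国連 in the corpus.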

Toru Hisamitsu, Yoshihiko Nitta


COLING 1996 | COLING 2008 | Compound Nouns | Unregistered Word | Word Dependency Structure


Post Info

Added	02 Nov 2010
Updated	02 Nov 2010
Type	Conference
Year	1996
Where	COLING
Authors	Toru Hisamitsu, Yoshihiko Nitta
