Korean is an agglutinative language that does not have explicit word boundaries. It is also a highly inflective language that exhibits severe coarticulation effects. These charac...
Sakriani Sakti, Andrew M. Finch, Ryosuke Isotani, ...
In this paper, we present a two-stage approach to acquire Japanese unknown morphemes from text with full POS tags assigned to them. We first acquire unknown morphemes only making ...
In this paper we introduce a statistical Named Entity recognizer (NER) system for the Hungarian language. We examined three methods for identifying and disambiguating proper nouns...
We present a method for constructing, maintaining and consulting a database of proper nouns. We describe noun phrases composed of a proper noun and/or a description of a human occ...
This paper describes a classifier that assigns semantic thesaurus categories to unknown Chinese words (words not already in the CiLin thesaurus and the Chinese Electronic Dictiona...
We present the results of an agreement task carried out in the framework of the KNOW Project and consisting in manually annotating an agreement sample totaling 50 sentences extrac...
We present a method for learning to find English to Chinese transliterations on the Web. In our approach, proper nouns are expanded into new queries aimed at maximizing the probab...
Translating news articles frequently involves finding foreign language equivalents for proper nouns occurring for the first time in an original article, a time-consuming and labor...
The paper presents a data cleansing technique for string databases. We propose and evaluate an algorithm that identifies a group of strings that consists of (multiple) occurrence...