Sciweavers

IJCPOL
2008

Transliterated Named Entity Recognition Based on Chinese Word Sketch

One of the unique challenges of Chinese language processing is cross-strait named entity recognition. Because the PRC and Taiwan have adopted different transliteration strategies, transliterations of foreign names can vary greatly between the two. This situation poses a serious problem for NLP tasks, including data mining, translation, and information retrieval. In this paper, we introduce a novel approach to the automatic extraction of divergent transliterations of foreign named entities by bootstrapping co-occurrence statistics from tagged Chinese corpora. In this study, we use Chinese Word Sketch. The automatically bootstrapped transliteration pairs are further screened based on phonetic similarity. Precision is evaluated at more than 90% against manually corrected transliteration pairs.
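The abstract does not say how phonetic similarity is computed, so the following is only a minimal sketch of what such a screening step could look like, not the authors' method. It assumes each candidate pair already carries a hand-supplied pinyin romanization and uses normalized edit distance over those strings as a stand-in similarity measure. The function names, the example pairs, and the 0.5 threshold are all hypothetical.

```python
# Illustrative sketch only: the paper's actual similarity measure is not
# given in the abstract. Normalized Levenshtein distance over pinyin
# strings is an assumed stand-in.

def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance using a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phonetic_similarity(pinyin_a: str, pinyin_b: str) -> float:
    """Similarity in [0, 1]; 1.0 means identical romanizations."""
    longest = max(len(pinyin_a), len(pinyin_b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(pinyin_a, pinyin_b) / longest


def screen_pairs(candidates, threshold=0.5):
    """Keep candidate pairs whose romanizations are similar enough.

    Each candidate is (prc_form, taiwan_form, prc_pinyin, taiwan_pinyin).
    The threshold is a hypothetical value chosen for illustration.
    """
    return [(prc, tw) for prc, tw, p_prc, p_tw in candidates
            if phonetic_similarity(p_prc, p_tw) >= threshold]


# Hypothetical candidates, as a bootstrapping step might propose them.
# Pinyin is supplied by hand; the last pair is a deliberate mismatch.
CANDIDATES = [
    ("布什", "布希", "bushi", "buxi"),
    ("奥巴马", "歐巴馬", "aobama", "oubama"),
    ("布什", "克林頓", "bushi", "kelindun"),
]

if __name__ == "__main__":
    for prc, tw in screen_pairs(CANDIDATES):
        print(prc, tw)
```

A real system would more likely compare syllables or phonemes than raw letters, but the shape of the filter (score each bootstrapped pair, then keep those above a cutoff) would stay the same.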
Petr Simon, Chu-Ren Huang, Shu-Kai Hsieh, Jia-Fei Hong
Added 12 Dec 2010
Updated 12 Dec 2010
Type Journal
Year 2008
Where IJCPOL
Authors Petr Simon, Chu-Ren Huang, Shu-Kai Hsieh, Jia-Fei Hong