Transliterated Named Entity Recognition Based on Chinese Word Sketch



One of the unique challenges in Chinese Language Processing is cross-strait named entity recognition. Because the PRC and Taiwan adopt different transliteration strategies, transliterations of foreign names can vary greatly between the two. This situation poses a serious problem for NLP tasks, including data mining, translation, and information retrieval. In this paper, we introduce a novel approach to the automatic extraction of divergent transliterations of foreign named entities by bootstrapping co-occurrence statistics from tagged Chinese corpora. In this study, we use Chinese Word Sketch. The automatically bootstrapped transliteration pairs are further screened based on phonetic similarity. Precision is evaluated at more than 90% against manually corrected transliteration pairs.
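The phonetic screening step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the similarity measure (normalized edit distance over toneless pinyin), the 0.5 threshold, the function names, and the sample pairs are all assumptions chosen for illustration, and the romanizations are supplied by hand rather than derived from a corpus.

```python
from typing import List, Tuple

# (PRC form, Taiwan form, PRC pinyin, Taiwan pinyin); hand-entered, hypothetical input
Candidate = Tuple[str, str, str, str]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def phonetic_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]: 1 minus the length-normalized edit distance."""
    if not a and not b:
        return 1.0
    return 1.0 - edit_distance(a, b) / max(len(a), len(b))


def screen_pairs(candidates: List[Candidate],
                 threshold: float = 0.5) -> List[Candidate]:
    """Keep candidate pairs whose romanizations are similar enough.

    The threshold is an assumed value, not one reported in the paper.
    """
    return [c for c in candidates
            if phonetic_similarity(c[2], c[3]) >= threshold]


# Illustrative candidates: two plausible pairs and one spurious pairing.
candidates = [
    ("布什", "布希", "bushi", "buxi"),
    ("奥巴马", "歐巴馬", "aobama", "oubama"),
    ("布什", "歐巴馬", "bushi", "oubama"),
]
kept = screen_pairs(candidates)
```

In this toy input, the two pairs that name the same person survive the filter, while the mismatched pairing is discarded. A real system would likely compare phoneme sequences rather than raw pinyin letters, since letter-level distance treats unrelated sounds that share spelling as similar.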

Petr Simon, Chu-Ren Huang, Shu-Kai Hsieh, Jia-Fei Hong


Chinese Language Processing | Foreign Named Entities | IJCPOL 2008 | Transliteration Pairs


Post Info

Added	12 Dec 2010
Updated	12 Dec 2010
Type	Journal
Year	2008
Where	IJCPOL
Authors	Petr Simon, Chu-Ren Huang, Shu-Kai Hsieh, Jia-Fei Hong
