
EACL
2009
ACL Anthology

MINT: A Method for Effective and Scalable Mining of Named Entity Transliterations from Large Comparable Corpora

In this paper, we address the problem of mining transliterations of Named Entities (NEs) from large comparable corpora. We leverage the empirical fact that multilingual news articles with similar news content are rich in Named Entity Transliteration Equivalents (NETEs). Our mining algorithm, MINT, uses a cross-language document similarity model to align multilingual news articles and then mines NETEs from the aligned articles using a transliteration similarity model. We show that our approach is highly effective on 6 different comparable corpora between English and 4 languages from 3 different language families. Furthermore, it performs substantially better than a state-of-the-art competitor.
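The two-stage pipeline described in the abstract (align articles across languages, then mine name pairs from each aligned pair) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the MINT implementation: the document similarity is a simple Jaccard overlap through a toy bilingual lexicon, the transliteration similarity is a character-level ratio from Python's `difflib` on romanized text, capitalization stands in for named entity detection, and all function names, thresholds, and sample sentences are hypothetical.

```python
from difflib import SequenceMatcher


def doc_similarity(src_doc, tgt_doc, lexicon):
    """Stand-in cross-language similarity (assumption): Jaccard overlap
    after mapping source words through a toy bilingual lexicon."""
    src = {lexicon.get(w, w) for w in src_doc.lower().split()}
    tgt = set(tgt_doc.lower().split())
    if not src or not tgt:
        return 0.0
    return len(src & tgt) / len(src | tgt)


def translit_similarity(src_word, tgt_word):
    """Stand-in transliteration similarity (assumption): character-level
    ratio between lowercased strings; the target is assumed romanized."""
    return SequenceMatcher(None, src_word.lower(), tgt_word.lower()).ratio()


def align_documents(src_docs, tgt_docs, lexicon, doc_threshold):
    """Stage 1: pair each source article with its most similar target article."""
    pairs = []
    for s in src_docs:
        best = max(tgt_docs, key=lambda t: doc_similarity(s, t, lexicon), default=None)
        if best is not None and doc_similarity(s, best, lexicon) >= doc_threshold:
            pairs.append((s, best))
    return pairs


def mine_pairs(src_docs, tgt_docs, lexicon, doc_threshold=0.2, word_threshold=0.6):
    """Stage 2: within each aligned pair, keep word pairs that look like
    transliterations of each other."""
    mined = set()
    for s, t in align_documents(src_docs, tgt_docs, lexicon, doc_threshold):
        # Crude named entity proxy (assumption): capitalized source words.
        candidates = {w for w in s.split() if w[:1].isupper()}
        for sw in candidates:
            for tw in set(t.split()):
                if translit_similarity(sw, tw) >= word_threshold:
                    mined.add((sw, tw))
    return mined


if __name__ == "__main__":
    # Hypothetical toy data: one English article, two romanized target articles.
    lexicon = {"visited": "gaye", "today": "aaj"}
    src_docs = ["Obama visited London today"]
    tgt_docs = ["obama landan aaj gaye", "cricket match kal hai"]
    print(sorted(mine_pairs(src_docs, tgt_docs, lexicon)))
```

Restricting the word-level comparison to aligned article pairs is what keeps the candidate space small; a real system would replace both placeholder similarity functions with trained models.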
Added 17 Feb 2011
Updated 17 Feb 2011
Type Conference
Year 2009
Where EACL
Authors Raghavendra Udupa, K. Saravanan, A. Kumaran, Jagadeesh Jagarlamudi