
ACL
2011

An Algorithm for Unsupervised Transliteration Mining with an Application to Word Alignment

We propose a language-independent method for the automatic extraction of transliteration pairs from parallel corpora. In contrast to previous work, our method uses no form of supervision and does not require linguistically informed preprocessing. We conduct experiments on data sets from the NEWS 2010 shared task on transliteration mining and achieve an F-measure of up to 92%, outperforming most of the semi-supervised systems that were submitted. We also apply our method to English/Hindi and English/Arabic parallel corpora and compare the results with manually built gold standards that mark transliterated word pairs. Finally, we integrate the transliteration module into the GIZA++ word aligner and evaluate it on two word alignment tasks, achieving improvements in both precision and recall, measured against gold-standard word alignments.
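The abstract reports results as an F-measure against gold standards of transliterated word pairs. As a minimal sketch of how such a score is commonly computed, the Python below compares a set of mined pairs with a gold set. The function name `precision_recall_f1` and the toy word pairs are hypothetical illustrations, not taken from the paper, and the NEWS 2010 shared task may define its metrics with additional details not shown here.

```python
def precision_recall_f1(mined_pairs, gold_pairs):
    """Score mined transliteration pairs against a gold-standard set.

    Hypothetical helper for illustration; not the authors' evaluation code.
    """
    mined = set(mined_pairs)
    gold = set(gold_pairs)
    true_positives = len(mined & gold)

    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(mined) if mined else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0

    # Balanced F-measure: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data (assumption): romanized stand-ins for target-language words.
gold = {("london", "landan"), ("paris", "peris"),
        ("berlin", "barlin"), ("tokyo", "tokyo")}
mined = {("london", "landan"), ("paris", "peris"), ("house", "ghar")}

p, r, f = precision_recall_f1(mined, gold)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

In this toy run, two of the three mined pairs are in the gold set, and two of the four gold pairs are recovered, so precision is 2/3 and recall is 1/2.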
Hassan Sajjad, Alexander Fraser, Helmut Schmid
Added 23 Aug 2011
Updated 23 Aug 2011
Type Journal
Year 2011
Where ACL