Sciweavers

ACL
2008

Learning Bilingual Lexicons from Monolingual Corpora

We present a method for learning bilingual translation lexicons from monolingual corpora. Word types in each language are characterized by purely monolingual features, such as context counts and orthographic substrings. Translations are induced using a generative model based on canonical correlation analysis, which explains the monolingual lexicons in terms of latent matchings. We show that high-precision lexicons can be learned in a variety of language pairs and from a range of corpus types.
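As a rough illustration of the "purely monolingual features" mentioned in the abstract, the following minimal sketch computes orthographic substring counts and context counts for a word type. The function names, the boundary marker `#`, the substring length `n=3`, and the context window of 2 are all assumptions made for this example; the abstract does not specify them. The sketch covers only feature extraction, not the canonical correlation analysis model or the matching step.

```python
from collections import Counter


def orthographic_features(word, n=3):
    """Count character n-grams of a word, padded with '#' boundary markers.

    The marker and the default n=3 are illustrative assumptions.
    """
    padded = "#" + word + "#"
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))


def context_features(tokens, target, window=2):
    """Count words within `window` positions of each occurrence of `target`.

    The window size is an illustrative assumption.
    """
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                counts[tokens[j]] += 1
    return counts


# Toy monolingual corpus (hypothetical data, for illustration only).
corpus = "the cat sat on the mat while the dog sat on the rug".split()

print(orthographic_features("cat"))
print(context_features(corpus, "sat"))
```

Each word type in each language would get one such feature vector built from its own monolingual corpus alone; no parallel text is involved at this stage.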
Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick, Dan Klein
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where ACL
Authors Aria Haghighi, Percy Liang, Taylor Berg-Kirkpatrick, Dan Klein