Bilingual Word Embeddings from Non-Parallel Document-Aligned Data Applied to Bilingual Lexicon Induction

We propose a simple yet effective approach to learning bilingual word embeddings (BWEs) from non-parallel document-aligned data (based on the omnipresent skip-gram model), and its application to bilingual lexicon induction (BLI). We demonstrate the utility of the induced BWEs in the BLI task by reporting on benchmarking BLI datasets for three language pairs: (1) We show that our BWE-based BLI models significantly outperform the MuPTM-based and context-counting models in this setting, and obtain the best reported BLI results for all three tested language pairs; (2) We also show that our BWE-based BLI models outperform other BLI models based on recently proposed BWEs that require parallel data for bilingual training.
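The abstract does not spell out the training recipe, so the following is only a minimal sketch of the general idea, under two assumptions that are not stated above: (1) each document-aligned pair is merged into a single pseudo-bilingual document whose words are shuffled, and (2) a standard skip-gram model (here gensim's Word2Vec with sg=1) is trained on those merged documents, after which a lexicon is induced by cosine similarity in the shared embedding space. The function names (merge_and_shuffle, train_bwe, induce_lexicon) and the toy data are illustrative, not from the paper.

```python
# Hedged sketch of a BWE-based BLI pipeline over document-aligned data.
# Assumptions (not taken from the abstract): merge-and-shuffle pseudo-bilingual
# documents, gensim skip-gram training, cosine-similarity lexicon induction.
import random
import numpy as np
from gensim.models import Word2Vec

def merge_and_shuffle(doc_src, doc_tgt, seed=0):
    """Merge one aligned document pair into a shuffled pseudo-bilingual document."""
    merged = list(doc_src) + list(doc_tgt)
    random.Random(seed).shuffle(merged)
    return merged

def train_bwe(aligned_docs, dim=300, window=16):
    """Train skip-gram (sg=1) over all merged pseudo-bilingual documents."""
    corpus = [merge_and_shuffle(s, t, seed=i) for i, (s, t) in enumerate(aligned_docs)]
    return Word2Vec(sentences=corpus, vector_size=dim, window=window,
                    sg=1, negative=25, min_count=1, workers=4)

def induce_lexicon(model, src_words, tgt_words, top_k=1):
    """For each source word, return the top-k target words by cosine similarity."""
    tgt_mat = np.stack([model.wv[w] for w in tgt_words])
    tgt_mat = tgt_mat / np.linalg.norm(tgt_mat, axis=1, keepdims=True)
    lexicon = {}
    for w in src_words:
        v = model.wv[w]
        sims = tgt_mat @ (v / np.linalg.norm(v))
        best = np.argsort(-sims)[:top_k]
        lexicon[w] = [tgt_words[i] for i in best]
    return lexicon

# Toy usage with language-prefixed tokens (hypothetical data):
docs = [(["en_dog", "en_cat", "en_house"], ["es_perro", "es_gato", "es_casa"])] * 50
model = train_bwe(docs, dim=20, window=4)
print(induce_lexicon(model, ["en_dog"], ["es_perro", "es_gato", "es_casa"]))
```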
Ivan Vulić, Marie-Francine Moens
Type Conference
Year 2015
Where ACL
Authors Ivan Vulić, Marie-Francine Moens