Parallel data in the domain of interest is the key resource when training a statistical machine translation (SMT) system for a specific purpose. Since ad-hoc manual translation c...
Prasanth Kolachina, Nicola Cancedda, Marc Dymetman...
The world wide web is a natural setting for cross-lingual information retrieval. The European Union is a typical example of a multilingual scenario, where multiple users have to de...
In conventional word alignment methods, some employ statistical models or statistical measures, which need large-scale bilingual sentencealigned training corpora. Others employ dic...
Sparse coding--that is, modelling data vectors as sparse linear combinations of basis elements--is widely used in machine learning, neuroscience, signal processing, and statistics...
Julien Mairal, Francis Bach, Jean Ponce, Guillermo...
Statistical machine translation to morphologically richer languages is a challenging task and more so if the source and target languages differ in word order. Current state-of-the...