FINTAL
2006

Statistical Machine Translation of German Compound Words

Abstract. German compound words pose special problems for statistical machine translation systems: the occurrence of each of the components in the training data is not sufficient for successful translation. Even if the compound itself has been seen during training, the system may not be capable of translating it properly into two or more words. If German is the target language, the system might generate only separated components or may not be capable of choosing the correct compound. In this work, we investigate and compare different strategies for the treatment of German compound words in statistical machine translation systems. For translation from German, we compare linguistic-based and corpus-based compound splitting. For translation into German, we investigate splitting and rejoining German compounds, as well as joining English potential components. Additionally, we investigate word alignments enhanced with knowledge about the splitting points of German compounds. The translation qual...
Maja Popovic, Daniel Stein, Hermann Ney
Added 22 Aug 2010
Updated 22 Aug 2010
Type Conference
Year 2006
Where FINTAL