ACL
2015

What's in a Domain? Analyzing Genre and Topic Differences in Statistical Machine Translation

Domain adaptation is an active field of research in statistical machine translation (SMT), but so far most work has ignored the distinction between the topic and the genre of documents. In this paper we quantify and disentangle the impact of genre and topic differences on translation quality by introducing a new data set with controlled topic and genre distributions. In addition, we perform a detailed analysis showing that differences across topics explain translation performance differences across genres only to a limited degree, and that genre-specific errors are attributable more to model coverage than to suboptimal scoring of translation candidates.
Added 13 Apr 2016
Updated 13 Apr 2016
Type Journal
Year 2015
Where ACL
Authors Marlies van der Wees, Arianna Bisazza, Wouter Weerkamp, Christof Monz