We present a new, unique and freely available parallel corpus containing European Union (EU) documents of mostly legal nature. It is available in all 20 official EU languages, wit...
Ralf Steinberger, Bruno Pouliquen, Anna Widiger, C...
The text content of a web site must help visitors find what they are looking for. However, the reality is quite different: many times the web page text content is a...
When building a new spoken dialogue application, large amounts of domain-specific data are required. This paper addresses the issue of generating in-domain training data when litt...
We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the b...
We present a new phrase-based conditional exponential family translation model for statistical machine translation. The model operates on a feature representation in which sentenc...