

Fixed Length Word Suffix for Factored Statistical Machine Translation

Factored Statistical Machine Translation extends the phrase-based SMT model by allowing each word to be a vector of factors. Experiments have shown the effectiveness of many factors, including part-of-speech tags, in improving the grammaticality of the output. However, high-quality part-of-speech taggers are not available in the open domain for many languages. In this paper we use a fixed-length word suffix as a new factor in factored SMT and achieve significant improvements in three sets of experiments: a large NIST Arabic-to-English system, a medium WMT Spanish-to-English system, and a small TRANSTAC English-to-Iraqi system.
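As a rough illustration of what a fixed-length suffix factor could look like, the sketch below annotates each token of a sentence with its last few characters. It is not taken from the paper: the suffix length of 3, the `word|factor` notation, the choice to keep words shorter than the suffix length whole, and the function names `suffix_factor` and `annotate` are all assumptions made for this example.

```python
def suffix_factor(word: str, length: int = 3) -> str:
    """Return the last `length` characters of `word`.

    Assumption: words shorter than `length` are returned unchanged.
    """
    if length < 1:
        raise ValueError("length must be at least 1")
    return word[-length:]


def annotate(sentence: str, length: int = 3, sep: str = "|") -> str:
    """Attach the suffix factor to every whitespace-separated token.

    Assumption: tokens and factors are joined with `sep`, giving a
    `word|suffix` representation of each factored token.
    """
    return " ".join(
        f"{token}{sep}{suffix_factor(token, length)}"
        for token in sentence.split()
    )


if __name__ == "__main__":
    print(annotate("translation systems improved"))
```

Because the factor is computed from the surface form alone, a sketch like this needs no tagger or other language-specific tool, which is the property the abstract highlights for languages without high-quality part-of-speech taggers.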
Narjes Sharif Razavian, Stephan Vogel
Added 10 Feb 2011
Updated 10 Feb 2011
Type Journal
Year 2010
Where ACL
Authors Narjes Sharif Razavian, Stephan Vogel