Sciweavers

ACL
2012

Machine Translation without Words through Substring Alignment

In this paper, we demonstrate that accurate machine translation is possible without the concept of “words,” treating MT as a problem of transformation between character strings. We achieve this result by applying phrasal inversion transduction grammar alignment techniques to character strings to train a character-based translation model, and using this in the phrase-based MT framework. We also propose a look-ahead parsing algorithm and substring-informed prior probabilities to achieve more effective and efficient alignment. In an evaluation, we demonstrate that character-based translation can achieve results that compare to word-based systems while effectively translating unknown and uncommon words over several language pairs.
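As a rough illustration of what treating MT as a transformation between character strings can look like at the data level, the sketch below rewrites a sentence as a sequence of single-character tokens, so that a standard phrase-based pipeline would see characters where it normally sees words. This is not the authors' code: the function names and the `_` boundary marker are assumptions made for this example, and the sketch assumes the marker does not otherwise occur in the text.

```python
# Hypothetical word-boundary marker; an assumption for this sketch, not from the paper.
SPACE_MARK = "_"


def to_char_tokens(sentence: str) -> str:
    """Rewrite a sentence as space-separated characters.

    Original spaces are replaced by SPACE_MARK so they survive as tokens.
    """
    return " ".join(SPACE_MARK if ch == " " else ch for ch in sentence)


def from_char_tokens(tokens: str) -> str:
    """Invert to_char_tokens: join the characters and restore the spaces."""
    return "".join(" " if tok == SPACE_MARK else tok for tok in tokens.split(" "))
```

With both sides of a parallel corpus preprocessed this way, multi-character "phrases" play the role that multi-word phrases play in a word-based system, and system output can be mapped back to ordinary text with `from_char_tokens`.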
Added 29 Sep 2012
Updated 29 Sep 2012
Type Journal
Year 2012
Where ACL
Authors Graham Neubig, Taro Watanabe, Shinsuke Mori, Tatsuya Kawahara