Binarization of Synchronous Context-Free Grammars (SCFG) is essential for achieving polynomial-time decoding in SCFG-parsing-based machine translation systems. In t...
Tong Xiao, Mu Li, Dongdong Zhang, Jingbo Zhu, Ming...
Training a statistical machine translation system starts with tokenizing a parallel corpus. Some languages, such as Chinese, do not incorporate spacing in their writing system, which creat...
In this article we outline a basic approach to treating metonymy properly in a multilingual machine translation system. This is the first attempt at treating metonymy in a machin...
We propose a general method to watermark and probabilistically identify the structured outputs of machine learning algorithms. Our method is robust to local editing operations and...
Ashish Venugopal, Jakob Uszkoreit, David Talbot, F...
For many applications, such as machine translation and bilingual information retrieval, bilingual corpora play an important role in training the system. Because they a...