Learning Translation Boundaries for Phrase-Based Decoding

14 years 12 months ago

Download www1.i2r.a-star.edu.sg

Constrained decoding is of great importance not only for speed but also for translation quality. Previous efforts explore soft syntactic constraints which are based on constituent boundaries deduced from parse trees of the source language. We present a new framework to establish soft constraints based on a more natural alternative: translation boundary rather than constituent boundary. We propose simple classifiers to learn translation boundaries for any source sentences. The classifiers are trained directly on word-aligned corpus without using any additional resources. We report the accuracy of our translation boundary classifiers. We show that using constraints based on translation boundaries predicted by our classifiers achieves significant improvements over the baseline on large-scale Chinese-toEnglish translation experiments. The new constraints also significantly outperform constituent boundary based syntactic constrains.

Deyi Xiong, Min Zhang, Haizhou Li

Real-time Traffic