Sciweavers

COLING
2008

Linguistically Annotated BTG for Statistical Machine Translation

14 years 1 months ago
Linguistically Annotated BTG for Statistical Machine Translation
Bracketing Transduction Grammar (BTG) is a natural choice for effective integration of desired linguistic knowledge into statistical machine translation (SMT). In this paper, we propose a Linguistically Annotated BTG (LABTG) for SMT. It conveys linguistic knowledge of source-side syntax structures to BTG hierarchical structures through linguistic annotation. From the linguistically annotated data, we learn annotated BTG rules and train linguistically motivated phrase translation model and reordering model. We also present an annotation algorithm that captures syntactic information for BTG nodes. The experiments show that the LABTG approach significantly outperforms a baseline BTGbased system and a state-of-the-art phrasebased system on the NIST MT-05 Chineseto-English translation task. Moreover, we empirically demonstrate that the proposed method achieves better translation selection and phrase reordering.
Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li
Added 29 Oct 2010
Updated 29 Oct 2010
Type Conference
Year 2008
Where COLING
Authors Deyi Xiong, Min Zhang, AiTi Aw, Haizhou Li
Comments (0)