Inducing Sentence Structure from Parallel Corpora for Reordering



When translating between languages that differ substantially in word order, machine translation (MT) systems benefit from syntactic preordering, an approach that uses features from a syntactic parse to permute source words into a target-language-like order. This paper presents a method for inducing parse trees automatically from a parallel corpus, instead of using a supervised parser trained on a treebank. These induced parses are used to preorder source sentences. We demonstrate that our induced parser is effective: it not only improves a state-of-the-art phrase-based system with integrated reordering, but also approaches the performance of a recent preordering method based on a supervised parser. These results show that the syntactic structure relevant to MT preordering can be learned automatically from parallel text, thus establishing a new application for unsupervised grammar induction.
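As a rough illustration of what tree-based preordering means, the sketch below permutes source words by walking a binary tree in which each internal node either keeps or swaps the order of its two children. This is a minimal sketch under stated assumptions, not the authors' model: the `Node` class, the `inverted` flag, the `reorder_yield` function, and the example sentence are all hypothetical, and the abstract does not say how the induced trees or reordering decisions are actually represented.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Node:
    """Hypothetical binary tree node over source words."""
    left: "Tree"
    right: "Tree"
    inverted: bool = False  # assumption: True means the children swap order


# A tree is either a single word (leaf) or an internal Node.
Tree = Union[str, Node]


def reorder_yield(tree: Tree) -> List[str]:
    """Return the source words in the order implied by the tree."""
    if isinstance(tree, str):
        return [tree]
    left = reorder_yield(tree.left)
    right = reorder_yield(tree.right)
    # An inverted node emits its right child before its left child.
    return right + left if tree.inverted else left + right


# Illustrative only: move an English verb after its object,
# as a verb-final target language might require.
tree = Node("John", Node("ate", Node("the", "apple"), inverted=True))
reordered = reorder_yield(tree)
print(" ".join(reordered))  # prints "John the apple ate"
```

In a preordering pipeline, a permuted sentence of this kind would be passed to the translation system in place of the original word order; where the tree and the swap decisions come from is exactly what the paper proposes to learn from parallel text.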

John DeNero, Jakob Uszkoreit


EMNLP 2011 | Grammar Induction | Natural Language Processing | Parallel Corpus | Target Language |


Post Info

Added	20 Dec 2011
Updated	20 Dec 2011
Type	Journal
Year	2011
Where	EMNLP
Authors	John DeNero, Jakob Uszkoreit
