Forest-based Translation Rule Extraction


Translation rule extraction is a fundamental problem in machine translation, especially for linguistically syntax-based systems that need parse trees from either or both sides of the bitext. The current dominant practice uses only 1-best trees, which adversely affects the quality of the rule set because of parsing errors. We therefore propose a novel approach that extracts rules from a packed forest, which compactly encodes exponentially many parses. Experiments show that this method improves translation quality by over 1 BLEU point on a state-of-the-art tree-to-string system, and is 0.5 points better than (and twice as fast as) extracting from 30-best parses. When combined with our previous work on forest-based decoding, it achieves a 2.5 BLEU point improvement over the baseline, and even outperforms the hierarchical system Hiero by 0.7 points.
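To make the phrase "compactly encodes exponentially many parses" concrete, the following is a minimal sketch of a packed forest represented as an acyclic hypergraph, together with a dynamic program that counts the trees it encodes. It is an illustration only, not the authors' implementation: the dictionary layout and the names `count_trees` and `chain_forest` are assumptions introduced here, and the toy forest is artificial rather than the output of a real parser.

```python
from functools import lru_cache


def count_trees(forest, root):
    """Count the distinct trees encoded by a packed forest.

    `forest` (hypothetical layout) maps each node to a list of hyperedges;
    each hyperedge is a tuple of child nodes. A leaf has one empty hyperedge.
    """
    @lru_cache(maxsize=None)
    def count(node):
        total = 0
        for children in forest[node]:
            # Trees under one hyperedge: product of the children's counts.
            product = 1
            for child in children:
                product *= count(child)
            # Alternative hyperedges at the same node add up.
            total += product
        return total

    return count(root)


def chain_forest(levels):
    """Build a toy forest with two alternative analyses at every level."""
    forest = {"leaf": [()]}
    prev = "leaf"
    for i in range(levels):
        a, b, top = f"A{i}", f"B{i}", f"X{i}"
        forest[a] = [(prev,)]
        forest[b] = [(prev,)]
        forest[top] = [(a,), (b,)]  # two competing hyperedges share `prev`
        prev = top
    return forest, prev


forest, root = chain_forest(10)
print(len(forest), count_trees(forest, root))
```

In this toy case the forest has 3n + 1 nodes for n levels yet encodes 2^n trees, which is why a k-best list with a fixed k (such as 30) can only cover a small fraction of the analyses that a forest keeps available.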

Haitao Mi, Liang Huang

BLEU Points | EMNLP 2008 | Natural Language Processing | Rule Set Quality | Translation Rule Extraction

Post Info

Added	29 Oct 2010
Updated	29 Oct 2010
Type	Conference
Year	2008
Where	EMNLP
Authors	Haitao Mi, Liang Huang
