Inversion Transduction Grammar Constraints for Mining Parallel Sentences from Quasi-Comparable Corpora



Abstract. We present a new implication of Wu’s (1997) Inversion Transduction Grammar (ITG) Hypothesis, on the problem of retrieving truly parallel sentence translations from large collections of highly non-parallel documents. Our approach leverages a strong language universal constraint posited by the ITG Hypothesis, that can serve as a strong inductive bias for various language learning problems, resulting in both efficiency and accuracy gains. The task we attack is highly practical since non-parallel multilingual data exists in far greater quantities than parallel corpora, but parallel sentences are a much more useful resource. Our aim here is to mine truly parallel sentences, as opposed to comparable sentence pairs or loose translations as in most previous work. The method we introduce exploits Bracketing ITGs to produce the first known results for this problem. Experiments show that it obtains large accuracy gains on this task compared to the expected performance of state-of-the...
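As a rough illustration of the reordering constraint that a Bracketing ITG imposes, the sketch below tests whether a one-to-one word alignment, written as a permutation of target positions, can be built purely from binary "straight" and "inverted" combinations of adjacent spans. This is not the paper's sentence-mining method; the function name `is_itg_permutation` and the permutation encoding are assumptions made for this example only.

```python
def is_itg_permutation(perm):
    """Return True if `perm` (a permutation of 0..n-1) can be derived by
    repeatedly joining two adjacent spans in straight or inverted order.

    Hypothetical helper for illustration; not taken from the paper.
    """
    stack = []  # each entry is a contiguous block of target positions (lo, hi)
    for p in perm:
        stack.append((p, p))
        # Greedily merge the top two blocks while they are contiguous
        # in the target order.
        while len(stack) >= 2:
            lo1, hi1 = stack[-2]
            lo2, hi2 = stack[-1]
            straight = hi1 + 1 == lo2   # left block precedes right block
            inverted = hi2 + 1 == lo1   # right block precedes left block
            if straight or inverted:
                stack[-2:] = [(min(lo1, lo2), max(hi1, hi2))]
            else:
                break
    # A single remaining block means a full binary bracketing exists.
    return len(stack) == 1


print(is_itg_permutation([1, 0, 2]))     # simple local swap
print(is_itg_permutation([2, 3, 0, 1]))  # two blocks swapped
print(is_itg_permutation([1, 3, 0, 2]))  # "inside-out" pattern
print(is_itg_permutation([2, 0, 3, 1]))  # its mirror image
```

The first two alignments are accepted, while the two four-word "inside-out" patterns are rejected. Those two patterns are the only four-word reorderings that binary bracketing cannot produce.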

Dekai Wu, Pascale Fung


Comparable Sentence Pairs | IJCNLP 2005 | Natural Language Processing | Parallel Sentence Translations | Parallel Sentences


Post Info

Added	27 Jun 2010
Updated	27 Jun 2010
Type	Conference
Year	2005
Where	IJCNLP
Authors	Dekai Wu, Pascale Fung
