Using cognates to align sentences in bilingual corpora


In a recent paper, Gale and Church describe an inexpensive method for aligning bitext, based exclusively on sentence lengths [Gale and Church, 1991]. While this method produces surprisingly good results (a success rate around 96%), even better results are required to perform such tasks as the computer-assisted revision of translations. In this paper, we examine some of the weaknesses of Gale and Church’s program, and explain how just a small amount of linguistic knowledge would help to overcome these weaknesses. We discuss how cognates provide a cheap and reasonably reliable source of linguistic knowledge. To illustrate this, we describe a modification to the program in which the criterion is cognates rather than sentence lengths. Finally, we show how better and more efficient results may be obtained by combining the two criteria: length and “cognateness”. Our method can be generalized to accommodate other sources of linguistic knowledge, and experimentation shows that ...
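As a rough illustration of what a "cognateness" score between two candidate sentences might look like, here is a minimal Python sketch. It is not the authors' program: the function names (tokens, is_cognate, cognateness), the four-character prefix rule for alphabetic tokens, and the exact-match rule for tokens containing digits are all assumptions made for this example, and the abstract does not specify how cognates are detected or scored.

```python
import re


def tokens(sentence):
    # Lowercased word and number tokens; punctuation is ignored in this sketch.
    return re.findall(r"\w+", sentence.lower())


def is_cognate(a, b, prefix_len=4):
    # Assumed heuristic (not taken from the abstract):
    # tokens containing digits must be identical; alphabetic tokens
    # must both have at least prefix_len characters and share that prefix.
    if any(c.isdigit() for c in a) or any(c.isdigit() for c in b):
        return a == b
    if len(a) < prefix_len or len(b) < prefix_len:
        return False
    return a[:prefix_len] == b[:prefix_len]


def cognateness(src, tgt):
    # Greedy one-to-one matching of source tokens to target tokens,
    # normalized by the mean token count of the two sentences.
    src_tokens, tgt_tokens = tokens(src), tokens(tgt)
    if not src_tokens or not tgt_tokens:
        return 0.0
    unused = list(tgt_tokens)
    matches = 0
    for s in src_tokens:
        for t in unused:
            if is_cognate(s, t):
                unused.remove(t)  # each target token is matched at most once
                matches += 1
                break
    return matches / ((len(src_tokens) + len(tgt_tokens)) / 2)
```

A score like this could in principle be used alongside a length-based score when comparing candidate sentence pairs, which is the kind of combination the abstract describes. How the two criteria are actually weighted is not stated here, so that step is left out.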

Michel Simard, George F. Foster, Pierre Isabelle


CASCON 1993 | CASCON 2007 | Inexpensive Method | Linguistic Knowledge | Sentence Lengths


Post Info

Added	02 Nov 2010
Updated	02 Nov 2010
Type	Conference
Year	1993
Where	CASCON
Authors	Michel Simard, George F. Foster, Pierre Isabelle
