Automatic construction of parallel English-Chinese corpus for cross-language information retrieval

A major obstacle to the construction of a probabilistic translation model is the lack of large parallel corpora. In this paper we first describe a parallel text mining system that finds parallel texts automatically on the Web. The generated Chinese-English parallel corpus is used to train a probabilistic translation model which translates queries for Chinese-English cross-language information retrieval (CLIR). We will discuss some problems in translation model training and show the preliminary CLIR results.

Jiang Chen, Jian-Yun Nie

ANLP 2000 | Generated Chinese-English Parallel | Parallel Texts | Probabilistic Translation Model

Related Content

» Crosslingual relevance models

» Bootstrapping dictionaries for crosslanguage information retrieval

Post Info

Added	01 Nov 2010
Updated	01 Nov 2010
Type	Conference
Year	2000
Where	ANLP
Authors	Jiang Chen, Jian-Yun Nie
