
ANLP
2000

Automatic construction of parallel English-Chinese corpus for cross-language information retrieval

A major obstacle to the construction of a probabilistic translation model is the lack of large parallel corpora. In this paper we first describe a parallel text mining system that finds parallel texts automatically on the Web. The generated Chinese-English parallel corpus is used to train a probabilistic translation model which translates queries for Chinese-English cross-language information retrieval (CLIR). We will discuss some problems in translation model training and show the preliminary CLIR results.
Jiang Chen, Jian-Yun Nie
Type: Conference
Year: 2000
Where: ANLP
Authors: Jiang Chen, Jian-Yun Nie