Statistical language models should improve as the size of the n-grams increases from 3 to 5 or higher. However, the number of parameters and calculations, and the storage requirem...
Le Quan Ha, Philip Hanna, Darryl Stewart, F. Jack ...
This paper proposes a new method for automatic acquisition of Chinese bracketing knowledge from English-Chinese sentencealigned bilingual corpora. Bilingual sentence pairs are fir...
A major obstacle to the construction of a probabilistic translation model is the lack of large parallel corpora. In this paper we first describe a parallel text mining system that...
This paper presents a general platform, namely synchronous tree sequence substitution grammar (STSSG), for the grammar comparison study in Translational Equivalence Modeling (TEM)...
Min Zhang, Hongfei Jiang, Haizhou Li, AiTi Aw, She...
We investigate the task of unsupervised constituency parsing from bilingual parallel corpora. Our goal is to use bilingual cues to learn improved parsing models for each language ...