With the development of Internet environments, more and more language services have become accessible to ordinary people. However, the gap between human translators and machine tran...
The pipeline of most Phrase-Based Statistical Machine Translation (PB-SMT) systems starts from an automatically word-aligned parallel corpus. However, the word appears to be too fine-grained ...
The performance of n-gram language models depends to a large extent on the amount of training text available for building the models and the degree to which this text matches...
Statistical bilingual word alignment has been well studied in the context of machine translation. This paper adapts the bilingual word alignment algorithm to a monolingual scenario ...
In Chinese texts, words composed of single or multiple characters are not separated by spaces, unlike in most Western languages. Therefore, Chinese word segmentation is considered an ...