This paper addresses the task of handling unknown terms in SMT. We propose using source-language monolingual models and resources to paraphrase the source text prior to translatio...
In this paper, we propose a new Bayesian model for fully unsupervised word segmentation and an efficient blocked Gibbs sampler combined with dynamic programming for inference. Our...
Efficient processing of tera-scale text data is an important research topic. This paper proposes lossless compression of Ngram language models based on LOUDS, a succinct data stru...
In this article, we introduce the ongoing research in modeling and mapping for heterogeneous, customizable, parallel systems, as part of the effort in the newly established Center...
Randomised techniques allow very big language models to be represented succinctly. However, being batch-based they are unsuitable for modelling an unbounded stream of language whi...