We propose a language-independent approach for improving statistical machine translation for morphologically rich languages using a hybrid morpheme-word representation where the b...
In this work1 we obtain robust category-based language models to be integrated into speech recognition systems. Deductive rules are used to select linguistic categories and to matc...
We propose a language model based on a precise, linguistically motivated grammar (a hand-crafted Head-driven Phrase Structure Grammar) and a statistical model estimating the proba...
This paper proposes an OCR post-processing approach based on multi-knowledge, which integrates language knowledge and candidate distance information given by the OCR engine. In thi...
Statistical topic models provide a general data-driven framework for automated discovery of high-level knowledge from large collections of text documents. While topic models can p...
Chaitanya Chemudugunta, Padhraic Smyth, Mark Steyv...