In this paper, we introduce a generative probabilistic optical character recognition (OCR) model that describes an end-to-end process in the noisy channel framework, progressing f...
This paper investigates bootstrapping for statistical parsers to reduce their reliance on manually annotated training data. We consider both a mostly-unsupervised approach, co-tra...
Mark Steedman, Rebecca Hwa, Stephen Clark, Miles O...
Homograph ambiguity is an original issue in Text-to-Speech (TTS). To disambiguate homographs, several efficient approaches have been proposed such as part-of-speech (POS) n-gram, B...
We present a new part-of-speech tagger that demonstrates the following ideas: (i) explicit use of both preceding and following tag contexts via a dependency network representation...
Kristina Toutanova, Dan Klein, Christopher D. Mann...
This demonstration involves two-way automatic speech-to-speech translation on a consumer off-the-shelf PDA. This work was done as part of the DARPA-funded Babylon project, investig...
Alex Waibel, Ahmed Badran, Alan W. Black, Robert E...