This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
Most of the known stochastic sentence generators use syntactically annotated corpora, performing the projection to the surface in one stage. However, in full-fledged text generati...
Bernd Bohnet, Leo Wanner, Simon Mille, Alicia Burg...
This paper presents a new approach to language model construction, learning a language model not from text, but directly from continuous speech. A phoneme lattice is created using...
Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsu...
The recognition of text in everyday scenes is made difficult by viewing conditions, unusual fonts, and lack of linguistic context. Most methods integrate a priori appearance info...
David Smith, Jacqueline Feild, Eric Learned-Miller
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...