Sciweavers

542 search results - page 32 / 109
» Learning author-topic models from text corpora
ML
2000
ACM
13 years 7 months ago
Text Classification from Labeled and Unlabeled Documents using EM
This paper shows that the accuracy of learned text classifiers can be improved by augmenting a small number of labeled training documents with a large pool of unlabeled documents. ...
Kamal Nigam, Andrew McCallum, Sebastian Thrun, Tom...
COLING
2010
13 years 2 months ago
Broad Coverage Multilingual Deep Sentence Generation with a Stochastic Multi-Level Realizer
Most of the known stochastic sentence generators use syntactically annotated corpora, performing the projection to the surface in one stage. However, in full-fledged text generati...
Bernd Bohnet, Leo Wanner, Simon Mille, Alicia Burg...
INTERSPEECH
2010
13 years 2 months ago
Learning a language model from continuous speech
This paper presents a new approach to language model construction, learning a language model not from text, but directly from continuous speech. A phoneme lattice is created using...
Graham Neubig, Masato Mimura, Shinsuke Mori, Tatsu...
CVPR
2011
IEEE
13 years 3 months ago
Enforcing Similarity Constraints with Integer Programming for Better Scene Text Recognition
The recognition of text in everyday scenes is made difficult by viewing conditions, unusual fonts, and lack of linguistic context. Most methods integrate a priori appearance info...
David Smith, Jacqueline Feild, Eric Learned-Miller
LREC
2008
13 years 9 months ago
A Lightweight and Efficient Tool for Cleaning Web Pages
Originally conceived as a "naive" baseline experiment using traditional n-gram language models as classifiers, the NCLEANER system has turned out to be a fast and lightw...
Stefan Evert