Sciweavers

564 search results - page 65 / 113
» Improving distant supervision using inference learning
JMLR
2010
13 years 4 months ago
Inducing Tree-Substitution Grammars
Inducing a grammar from text has proven to be a notoriously challenging learning task despite decades of research. The primary reason for its difficulty is that in order to induce...
Trevor Cohn, Phil Blunsom, Sharon Goldwater
ICDAR
2011
IEEE
12 years 9 months ago
OCR-Driven Writer Identification and Adaptation in an HMM Handwriting Recognition System
We present an OCR-driven writer identification algorithm in this paper. Our algorithm learns writer-specific characteristics more precisely from explicit character alignment usi...
Huaigu Cao, Rohit Prasad, Prem Natarajan
LREC
2008
13 years 11 months ago
A Trainable Tokenizer, solution for multilingual texts and compound expression tokenization
Tokenization is one of the initial steps done for almost any text processing task. It is not particularly recognized as a challenging task for English monolingual systems but it r...
Oana Frunza
ML
2010
ACM
13 years 8 months ago
Semi-supervised local Fisher discriminant analysis for dimensionality reduction
When only a small number of labeled samples are available, supervised dimensionality reduction methods tend to perform poorly due to overfitting. In such cases, unlabeled samples ...
Masashi Sugiyama, Tsuyoshi Idé, Shinichi Na...
ICTIR
2009
Springer
13 years 7 months ago
Training Data Cleaning for Text Classification
In text classification (TC) and other tasks involving supervised learning, labelled data may be scarce or expensive to obtain; strategies are thus needed for maximizing t...
Andrea Esuli, Fabrizio Sebastiani