Sciweavers

Search results: Intelligent Selection of Language Model Training Data
NAACL
2004
15 years 3 months ago
Name Tagging with Word Clusters and Discriminative Training
We present a technique for augmenting annotated training data with hierarchical word clusters that are automatically derived from a large unannotated corpus. Cluster membership is...
Scott Miller, Jethran Guinness, Alex Zamanian
ICTAI
2006
IEEE
15 years 8 months ago
Learning to Predict Salient Regions from Disjoint and Skewed Training Sets
We present an ensemble learning approach that achieves accurate predictions from arbitrarily partitioned data. The partitions come from the distributed processing requirements of ...
Larry Shoemaker, Robert E. Banfield, Lawrence O. H...
NAACL
2010
15 years 12 days ago
Language identification of names with SVMs
The task of identifying the language of text or utterances has a number of applications in natural language processing. Language identification has traditionally been approached w...
Aditya Bhargava, Grzegorz Kondrak
SDM
2010
SIAM
15 years 4 months ago
Confidence-Based Feature Acquisition to Minimize Training and Test Costs
We present Confidence-based Feature Acquisition (CFA), a novel supervised learning method for acquiring missing feature values when there is missing data at both training and test...
Marie desJardins, James MacGlashan, Kiri L. Wagsta...
ACL
2009
15 years 10 days ago
Stochastic Gradient Descent Training for L1-regularized Log-linear Models with Cumulative Penalty
Stochastic gradient descent (SGD) uses approximate gradients estimated from subsets of the training data and updates the parameters in an online fashion. This learning framework i...
Yoshimasa Tsuruoka, Jun-ichi Tsujii, Sophia Anania...