Sciweavers

» The Inefficiency of Batch Training for Large Training Sets
GECCO
2008
Springer
13 years 10 months ago
Analysis of mammography reports using maximum variation sampling
A genetic algorithm (GA) was developed to implement a maximum variation sampling technique to derive a subset of data from a large dataset of unstructured mammography reports. It ...
Robert M. Patton, Barbara G. Beckerman, Thomas E. ...
ICML
2005
IEEE
14 years 9 months ago
Fast condensed nearest neighbor rule
We present a novel algorithm for computing a training set consistent subset for the nearest neighbor decision rule. The algorithm, called FCNN rule, has some desirable properties....
Fabrizio Angiulli
BIBE
2007
IEEE
14 years 25 days ago
Quality Assessment of Affymetrix GeneChip Data using the EM Algorithm and a Naive Bayes Classifier
Recent research has demonstrated the utility of using supervised classification systems for automatic identification of low quality microarray data. However, this approach requires...
Brian E. Howard, Beate Sick, Imara Perera, Yang Ju...
ACL
2004
13 years 10 months ago
Discriminative Language Modeling with Conditional Random Fields and the Perceptron Algorithm
This paper describes discriminative language modeling for a large vocabulary speech recognition task. We contrast two parameter estimation methods: the perceptron algorithm, and a...
Brian Roark, Murat Saraclar, Michael Collins, Mark...
CLEF
2011
Springer
12 years 8 months ago
Author Identification Using Semi-supervised Learning - Notebook for PAN at CLEF 2011
Author identification models fall into two major categories according to the way they handle the training texts: profile-based models produce one representation per author while in...
Ioannis Kourtis, Efstathios Stamatatos