Most machine learning algorithms are lazy: they extract from the training set the minimum information needed to predict its labels. Unfortunately, this often leads to models that ...
Joseph O'Sullivan, John Langford, Rich Caruana, Av...
We optimize the full-response diagnostic fault dictionary from a given test set. The smallest set of vectors is selected without loss of diagnostic resolution of the given test se...
—We address the problem of determining what size test set guarantees statistically significant results in a character recognition task, as a function of the expected error rate. ...
Isabelle Guyon, John Makhoul, Richard M. Schwartz,...
Background: Gene set enrichment testing has helped bridge the gap from an individual gene to a systems biology interpretation of microarray data. Although gene sets are defined a ...
The problem addressed in this paper is to predict a user's numeric rating in a product review from the text of the review. Unigram and n-gram representations of text are comm...