The growing body of DNA microarray data has the potential to advance our understanding of the molecular basis of disease. However, annotating microarray datasets with clinically us...
In this paper we introduce the Generalized Bayesian Committee Machine (GBCM) for applications with large data sets. In particular, the GBCM can be used in the context of kernel ba...
In many application domains there is a large amount of unlabeled data but only a very limited amount of labeled training data. One general approach that has been explored for util...
Avrim Blum, John D. Lafferty, Mugizi Robert Rweban...
Decision-tree algorithms are known to be unstable: small variations in the training set can result in different trees and different predictions for the same validation examples. B...
When data collection is costly and/or takes a significant amount of time, an early prediction of the classifier performance is extremely important for the design of the data minin...