A common assumption in supervised learning is that the training and test input points follow the same probability distribution. However, this assumption is not fulfilled, e.g., in...
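As a concrete illustration of the covariate-shift setting (not this paper's specific method), here is a minimal importance-weighting sketch: training losses are reweighted by p_test(x)/p_train(x), with both densities known by construction in this synthetic example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Training inputs from N(0, 1); test inputs from N(1, 1): only p(x) shifts.
x_train = rng.normal(0.0, 1.0, size=500)
x_test = rng.normal(1.0, 1.0, size=500)

def label(x, rng):
    # Noisy threshold rule shared by both distributions.
    return (x + 0.3 * rng.normal(size=x.shape) > 0.5).astype(int)

y_train, y_test = label(x_train, rng), label(x_test, rng)

def gauss(x, mu):  # N(mu, 1) density
    return np.exp(-0.5 * (x - mu) ** 2) / np.sqrt(2 * np.pi)

# Importance weights w(x) = p_test(x) / p_train(x), known here by construction.
w = gauss(x_train, 1.0) / gauss(x_train, 0.0)

plain = LogisticRegression().fit(x_train[:, None], y_train)
weighted = LogisticRegression().fit(x_train[:, None], y_train, sample_weight=w)

print("unweighted test accuracy:", plain.score(x_test[:, None], y_test))
print("weighted   test accuracy:", weighted.score(x_test[:, None], y_test))
```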
We start by showing that in an active learning setting, the Perceptron algorithm needs Ω(1/ε²) labels to learn linear separators within generalization error ε. We then prese...
Sanjoy Dasgupta, Adam Tauman Kalai, Claire Montele...
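The margin-based query rule below is a toy illustration of label-efficient perceptron learning, not the specific algorithm analyzed in the paper; the fixed query threshold of 0.1 is an arbitrary choice, whereas a real scheme would shrink it over time.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 10
w_star = rng.normal(size=d)
w_star /= np.linalg.norm(w_star)       # target linear separator

w = np.zeros(d)                        # current hypothesis
labels_used = 0

for _ in range(20000):
    x = rng.normal(size=d)
    x /= np.linalg.norm(x)
    # Query a label only when the current hypothesis is uncertain
    # (small margin); confident points are skipped unlabeled.
    if abs(w @ x) > 0.1:
        continue
    y = np.sign(w_star @ x)
    labels_used += 1
    if np.sign(w @ x) != y:            # standard Perceptron update on mistakes
        w += y * x

angle = np.arccos(np.clip(w @ w_star / np.linalg.norm(w), -1, 1))
print(f"labels used: {labels_used}, angle to target: {angle:.3f}")
```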
— We analyze the generalization performance of a student in a model composed of linear perceptrons: a true teacher, ensemble teachers, and the student. Calculating the generaliza...
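A rough numerical sketch of that setup, with assumed details: ensemble teachers as noisy, renormalized copies of the true teacher, and an LMS student trained on their averaged output. The paper's statistical-mechanics calculation is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K = 1000, 5                      # input dimension, number of ensemble teachers

B = rng.normal(size=N)
B /= np.linalg.norm(B)              # true teacher
# Ensemble teachers: noisy, renormalized copies of the true teacher (assumed).
Bk = B + 0.5 * rng.normal(size=(K, N))
Bk /= np.linalg.norm(Bk, axis=1, keepdims=True)

J = np.zeros(N)                     # student weights
eta = 0.1
for _ in range(5000):
    x = rng.normal(size=N) / np.sqrt(N)        # inputs with |x| ~ 1
    target = (Bk @ x).mean()                   # averaged ensemble-teacher output
    J += eta * (target - J @ x) * x            # LMS update toward that output

# Generalization error against the *true* teacher: for this input
# distribution, E[((B - J) @ x)^2] = |B - J|^2 / N.
print("generalization error vs true teacher:", np.sum((B - J) ** 2) / N)
```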
— In this paper a Particle Swarm Optimization (PSO)-based training strategy is introduced for fuzzy ARTMAP that minimizes generalization error while optimizing parameter values. ...
Eric Granger, Philippe Henniges, Luiz S. Oliveira,...
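Training fuzzy ARTMAP is beyond a short sketch, so the stand-in objective val_error below is invented; only the PSO update itself is illustrated, over a single hyperparameter in [0, 1].

```python
import numpy as np

rng = np.random.default_rng(3)

def val_error(p):
    # Stand-in for fuzzy ARTMAP validation error as a function of one
    # hyperparameter p (e.g., vigilance); the real objective would train
    # the network and measure error on held-out data.
    return 0.3 + 0.2 * np.sin(8 * p) + (p - 0.6) ** 2

n_particles, iters = 20, 100
pos = rng.uniform(0, 1, n_particles)       # particle positions
vel = np.zeros(n_particles)
pbest = pos.copy()                         # personal bests
pbest_f = val_error(pos)
gbest = pbest[np.argmin(pbest_f)]          # global best

for _ in range(iters):
    r1, r2 = rng.uniform(size=n_particles), rng.uniform(size=n_particles)
    vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
    pos = np.clip(pos + vel, 0, 1)
    f = val_error(pos)
    improved = f < pbest_f
    pbest[improved], pbest_f[improved] = pos[improved], f[improved]
    gbest = pbest[np.argmin(pbest_f)]

print(f"best hyperparameter ~ {gbest:.3f}, stand-in error {val_error(gbest):.3f}")
```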
The dependence of the classification error on the size of a bagging ensemble can be modeled within the framework of Monte Carlo theory for ensemble learning. These error curves ar...
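A direct empirical version of such an error curve (not the Monte Carlo model itself) can be traced by growing a bagging ensemble, e.g. with scikit-learn; the randomness enters through the bootstrap sample behind each tree.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.5, random_state=0)

# Test error as a function of ensemble size T.
for T in (1, 5, 10, 25, 50, 100):
    clf = BaggingClassifier(DecisionTreeClassifier(), n_estimators=T,
                            random_state=0).fit(Xtr, ytr)
    print(f"T={T:4d}  test error = {1 - clf.score(Xte, yte):.3f}")
```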
We describe and analyze a new approach for feature ranking in the presence of categorical features with a large number of possible values. It is shown that popular ranking criteria...
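The bias such work addresses can be reproduced in a few lines: on a finite sample, the empirical information gain of a pure-noise categorical feature grows with its number of distinct values, so ID-like features look spuriously informative.

```python
import numpy as np
from math import log2

rng = np.random.default_rng(4)
n = 200
y = rng.integers(0, 2, n)              # labels, independent of both features

def empirical_info_gain(feature, y):
    # H(Y) - H(Y | feature), estimated from counts.
    def H(labels):
        p = np.bincount(labels, minlength=2) / len(labels)
        return -sum(pi * log2(pi) for pi in p if pi > 0)
    gain = H(y)
    for v in np.unique(feature):
        mask = feature == v
        gain -= mask.mean() * H(y[mask])
    return gain

low_card = rng.integers(0, 2, n)       # 2 possible values
high_card = rng.integers(0, 100, n)    # 100 possible values (ID-like)
print("gain, 2-valued noise feature  :", empirical_info_gain(low_card, y))
print("gain, 100-valued noise feature:", empirical_info_gain(high_card, y))
```

Both features are pure noise, yet the high-cardinality one shows a much larger empirical gain.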
— To obtain accurate modeling results, it is of prime importance to find optimal values for the hyperparameters in the Support Vector Regression (SVR) model. In general, we sea...
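For context, the standard exhaustive baseline that such hyperparameter work seeks to improve on looks like this in scikit-learn (grid values chosen arbitrarily):

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=8, noise=10.0, random_state=0)

# Cross-validated grid search over the usual SVR hyperparameters.
grid = {"svr__C": [0.1, 1, 10, 100],
        "svr__epsilon": [0.01, 0.1, 1.0],
        "svr__gamma": ["scale", 0.01, 0.1]}
search = GridSearchCV(make_pipeline(StandardScaler(), SVR()), grid, cv=5)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```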
This is a survey of some theoretical results on boosting obtained from an analogous treatment of some regression and classification boosting algorithms. Some related papers include...
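As a reference point, a minimal AdaBoost loop with decision stumps, the canonical classification boosting algorithm such surveys treat; the reweighting step is the formula usually analyzed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=500, n_features=10, random_state=0)
y = 2 * y01 - 1                                 # labels in {-1, +1}

n, T = len(y), 25
w = np.full(n, 1 / n)                           # example weights
stumps, alphas = [], []

for _ in range(T):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    pred = stump.predict(X)
    err = w[pred != y].sum()                    # weighted training error
    alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
    w *= np.exp(-alpha * y * pred)              # up-weight the mistakes
    w /= w.sum()
    stumps.append(stump)
    alphas.append(alpha)

F = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("training error of the boosted ensemble:", np.mean(np.sign(F) != y))
```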
We examine the set covering machine when it uses data-dependent half-spaces for its set of features and bound its generalization error in terms of the number of training errors an...
Mario Marchand, Mohak Shah, John Shawe-Taylor, Mar...
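A heavily simplified greedy sketch of a set covering machine with data-dependent half-spaces, under assumed details not taken from the paper: half-spaces from midpoints of opposite-class point pairs, a subsampled candidate set, and a hypothetical penalty value.

```python
import numpy as np
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
P, N = X[y == 1], X[y == 0]

def halfspace(a, b):
    # Data-dependent half-space: normal from a pair of training points,
    # threshold at their midpoint.
    w = a - b
    t = w @ (a + b) / 2
    return lambda Z: (Z @ w >= t)

# Candidate features from (positive, negative) training-point pairs.
rng = np.random.default_rng(0)
cands = [halfspace(P[i], N[j])
         for i, j in zip(rng.integers(0, len(P), 300),
                         rng.integers(0, len(N), 300))]

chosen, covered = [], np.zeros(len(N), dtype=bool)
penalty = 2.0                      # hypothetical trade-off parameter
while not covered.all() and len(chosen) < 10:
    def usefulness(h):
        # Negatives newly rejected minus penalized errors on positives.
        return (~h(N) & ~covered).sum() - penalty * (~h(P)).sum()
    best = max(cands, key=usefulness)
    if usefulness(best) <= 0:
        break
    chosen.append(best)
    covered |= ~best(N)

def predict(Z):
    # Conjunction: positive only if every chosen half-space fires.
    if not chosen:
        return np.ones(len(Z), dtype=bool)
    return np.all([h(Z) for h in chosen], axis=0)

print("training error:", np.mean(predict(X) != (y == 1)))
```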
An important theoretical tool in machine learning is the bias/variance decomposition of the generalization error. It was introduced for the mean square error in [3]. The bias/vari...
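For squared error, the decomposition referred to is the standard one: with y = f(x) + ε, E[ε] = 0, Var(ε) = σ², and f̂_D the model fit to a random training sample D,

```latex
\mathbb{E}_{D,\varepsilon}\!\left[\big(y - \hat f_D(x)\big)^2\right]
  = \underbrace{\sigma^2}_{\text{noise}}
  + \underbrace{\big(f(x) - \mathbb{E}_D[\hat f_D(x)]\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\!\left[\big(\hat f_D(x) - \mathbb{E}_D[\hat f_D(x)]\big)^2\right]}_{\text{variance}}
```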