We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...
We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea ...
I describe a framework for interpreting Support Vector Machines (SVMs) as maximum a posteriori (MAP) solutions to inference problems with Gaussian Process priors. This probabilisti...
We examine methods for constructing regression ensembles based on a linear program (LP). The ensemble regression function consists of linear combinations of base hypotheses generat...
This paper introduces a Bayesian method for clustering dynamic processes. The method models dynamics as Markov chains and then applies an agglomerative clustering procedure to disc...
The execution order of a block of computer instructions on a pipelined machine can make a difference in running time by a factor of two or more. Compilers use heuristic schedulers...
Most classification algorithms receive as input a set of attributes of the classified objects. In many cases, however, the supplied set of attributes is not sufficient for creatin...
We consider the existence of a linear weak learner for boosting algorithms. A weak learner for binary classification problems is required to achieve a weighted empirical error on t...
The problem of state abstraction is of central importance in optimal control, reinforcement learning and Markov decision processes. This paper studies the case of variable reso...