We show that states of a dynamical system can be usefully represented by multi-step, action-conditional predictions of future observations. State representations that are grounded...
Michael L. Littman, Richard S. Sutton, Satinder P....
We combine the replica approach from statistical physics with a variational approach to analyze learning curves analytically. We apply the method to Gaussian process regression. A...
We derive an equivalence between AdaBoost and the dual of a convex optimization problem, showing that the only difference between minimizing the exponential loss used by AdaBoost ...
With the increasing number of users of mobile computing devices (e.g. personal digital assistants) and the advent of third generation mobile phones, wireless communications are be...
Neil D. Lawrence, Antony I. T. Rowstron, Christoph...
We present a new approach to bounding the true error rate of a continuous-valued classifier based upon PAC-Bayes bounds. The method first constructs a distribution over classifier...
When constructing a classifier, the probability of correct classification of future data points should be maximized. In the current paper this desideratum is translated in a very ...
Gert R. G. Lanckriet, Laurent El Ghaoui, Chiranjib...
We propose a new approach to reinforcement learning which combines least-squares function approximation with policy iteration. Our method is model-free and completely off-policy. ...
We give results about the learnability and required complexity of logical formulae to solve classification problems. These results are obtained by linking propositional logic with...
Adam Kowalczyk, Alex J. Smola, Robert C. Williamso...
Tangential hand velocity profiles of rapid human arm movements often appear as sequences of several bell-shaped acceleration-deceleration phases called submovements or movement un...