We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
In this paper we focus on an interpretation of Gaussian radial basis functions (GRBF) which motivates extensions and learning strategies. Specifically, we show that GRBF regressio...
We use reinforcement learning (RL) to compute strategies for multiagent soccer teams. RL may pro t signi cantly from world models (WMs) estimating state transition probabilities an...
In the context of binary classification, we define disagreement as a measure of how often two independently-trained models differ in their classification of unlabeled data. We exp...
This work deals with a new technique for the estimation of the parameters and number of components in a finite mixture model. The learning procedure is performed by means of a expe...