We show that linear value-function approximation is equivalent to a form of linear model approximation. We then derive a relationship between the model-approximation error and the...
Ronald Parr, Lihong Li, Gavin Taylor, Christopher ...
Research in reinforcementlearning (RL)has thus far concentrated on two optimality criteria: the discounted framework, which has been very well-studied, and the averagereward frame...
Potential-based shaping was designed as a way of introducing background knowledge into model-free reinforcement-learning algorithms. By identifying states that are likely to have ...
We provide a provably efficient algorithm for learning Markov Decision Processes (MDPs) with continuous state and action spaces in the online setting. Specifically, we take a mo...
We introduce new, efficient algorithms for value iteration with multiple reward functions and continuous state. We also give an algorithm for finding the set of all nondominated a...
Daniel J. Lizotte, Michael H. Bowling, Susan A. Mu...