Recently, we have introduced a novel approach to dynamic programming and reinforcement learning that is based on maintaining explicit representations of stationary distributions i...
Tao Wang, Daniel J. Lizotte, Michael H. Bowling, D...
Abstract. We investigate the problem of using function approximation in reinforcement learning where the agent’s policy is represented as a classifier mapping states to actions....
T ORDER REGRESSION (EXTENDED ABSTRACT) Kurt Driessensa Saso Dzeroskib a Department of Computer Science, University of Waikato, Hamilton, New Zealand (kurtd@waikato.ac.nz) b Departm...
We target the problem of closed-loop learning of control policies that map visual percepts to continuous actions. Our algorithm, called Reinforcement Learning of Joint Classes (RLJ...
Synchronous reinforcement learning (RL) algorithms with linear function approximation are representable as inhomogeneous matrix iterations of a special form (Schoknecht & Merk...