A number of reinforcement learning algorithms have been developed that are guaranteed to converge to the optimal solution when used with lookup tables. It is shown, however, that ...
— This paper proposes a high-level Reinforcement Learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach, whe...
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
In many practical reinforcement learning problems, the state space is too large to permit an exact representation of the value function, much less the time required to compute it. ...
Several researchers have recently investigated the connection between reinforcement learning and classification. We are motivated by proposals of approximate policy iteration schem...