We target the problem of closed-loop learning of control policies that map visual percepts to continuous actions. Our algorithm, called Reinforcement Learning of Joint Classes (RLJ...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
Positive transfer learning (TL) occurs when, after gaining experience from learning how to solve a (source) task, the same learner can exploit this experience to improve performanc...
"Ibots" (Integrating roBOTS) is a computer experiment in group learning. It is designed to understand how to use reinforcement learning to program automatically a team of robo...