We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...
Positive transfer learning (TL) occurs when, after gaining experience from learning how to solve a (source) task, the same learner can exploit this experience to improve performanc...
Abstract— Groups of reinforcement learning agents interacting in a common environment often fail to learn optimal behaviors. Poor performance is particularly common in environmen...
\Ibots" (Integrating roBOTS) is a computer experiment in group learning. It is designed to understand how to use reinforcement learning to program automatically a team of robo...