There has been a lot of recent work on Bayesian methods for reinforcement learning exhibiting near-optimal online performance. The main obstacle facing such methods is that in most...
Many algorithms such as Q-learning successfully address reinforcement learning in single-agent multi-time-step problems. In addition there are methods that address reinforcement l...
For many problems which would be natural for reinforcement learning, the reward signal is not a single scalar value but has multiple scalar components. Examples of such problems i...