Reinforcement Learning and the Bayesian Control Rule


We present an actor-critic scheme for reinforcement learning in complex domains. The main contribution is to show that planning and I/O dynamics can be separated such that an intractable planning problem reduces to a simple multi-armed bandit problem, where each lever stands for a potentially arbitrarily complex policy. Furthermore, we use the Bayesian control rule to construct an adaptive bandit player that is universal with respect to a given class of optimal bandit players, thus indirectly constructing an adaptive agent that is universal with respect to a given class of policies.
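As a rough illustration of the reduction described above, the sketch below treats each lever as a whole policy and picks levers by posterior sampling (Thompson sampling) with Beta priors. This is a swapped-in, closely related technique and not the construction from the paper; the names `make_policy` and `posterior_sampling_bandit`, the binary episode reward, and the fixed success probabilities are all assumptions made for the example.

```python
import random


def make_policy(success_prob):
    """Hypothetical stand-in for an arbitrarily complex policy: running it
    for one episode yields a binary reward with a fixed success probability."""
    def run_episode(rng):
        return 1 if rng.random() < success_prob else 0
    return run_episode


def posterior_sampling_bandit(policies, n_rounds, seed=0):
    """Choose among policies (levers) by sampling from Beta posteriors.

    Assumption: each policy's episode reward is Bernoulli with an unknown
    success rate, so a Beta(1, 1) prior per lever is conjugate.
    Returns how many times each lever was pulled.
    """
    rng = random.Random(seed)
    successes = [1] * len(policies)  # Beta alpha parameters
    failures = [1] * len(policies)   # Beta beta parameters
    pulls = [0] * len(policies)
    for _ in range(n_rounds):
        # Sample one plausible success rate per lever from its posterior.
        samples = [rng.betavariate(a, b) for a, b in zip(successes, failures)]
        # Act as if the sampled rates were true: pull the best-looking lever.
        lever = max(range(len(policies)), key=samples.__getitem__)
        reward = policies[lever](rng)
        # Update only the posterior of the lever that was pulled.
        successes[lever] += reward
        failures[lever] += 1 - reward
        pulls[lever] += 1
    return pulls


if __name__ == "__main__":
    policies = [make_policy(p) for p in (0.2, 0.5, 0.8)]
    print(posterior_sampling_bandit(policies, n_rounds=2000))
```

The player never inspects what a policy does internally; it only observes the reward of the episode it ran, which is what lets each lever stand for an arbitrarily complex policy.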

Pedro Alejandro Ortega, Daniel Alexander Braun, Simon J. Godsill


AGI 2011 | Artificial Intelligence | Bandit | Planning Problem | Reinforcement


» Bayesian Learning of Phrasal Tree-to-String Templates

» Bayesian reinforcement learning in continuous POMDPs with application to robot navigation

» Efficient methods for near-optimal sequential decision making under uncertainty

» Covariant Policy Search

» Bayesian reinforcement learning in continuous POMDPs with Gaussian processes

» Cognitive Agents Integrating Rules and Reinforcement Learning for Context-Aware Decision Su...

» A Minimum Relative Entropy Principle for Learning and Acting

» Reinforcement Learning Hierarchical Neuro-Fuzzy Politree Model for Control of Autonomous Ag...

Post Info

Added	24 Aug 2011
Updated	24 Aug 2011
Type	Journal
Year	2011
Where	AGI
Authors	Pedro Alejandro Ortega, Daniel Alexander Braun, Simon J. Godsill
