Tree Exploration for Bayesian RL Exploration


Research in reinforcement learning has produced algorithms for optimal decision making under uncertainty that fall into two main types. The first employs a Bayesian framework, where optimality improves with increased computational time. This is because the resulting planning task takes the form of a dynamic programming problem on a belief tree with an infinite number of states. The second type employs relatively simple algorithms that are shown to suffer small regret within a distribution-free framework. This paper presents a lower bound and a high-probability upper bound on the optimal value function for the nodes in the Bayesian belief tree, which are analogous to similar bounds in POMDPs. The bounds are then used to create more efficient strategies for exploring the tree. The resulting algorithms are compared with the distribution-free algorithm UCB1, as well as with a simpler baseline algorithm, on multi-armed bandit problems.
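For readers unfamiliar with the distribution-free baseline named in the abstract, the following is a minimal sketch of the standard UCB1 index rule on a Bernoulli multi-armed bandit. It is not taken from the paper: the function name `ucb1`, the arm means, the horizon, and the seed are all hypothetical choices made for illustration, and the paper's own experimental setup may differ.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on a simulated Bernoulli bandit (illustrative sketch only).

    arm_means: hypothetical success probability of each arm.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    counts = [0] * n_arms   # number of times each arm was played
    sums = [0.0] * n_arms   # total reward observed from each arm
    total_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Initialisation: play every arm once.
            arm = t - 1
        else:
            plays_so_far = t - 1

            def index(i):
                # Empirical mean plus the UCB1 confidence bonus.
                bonus = math.sqrt(2.0 * math.log(plays_so_far) / counts[i])
                return sums[i] / counts[i] + bonus

            arm = max(range(n_arms), key=index)

        # Simulated Bernoulli reward (an assumption of this sketch).
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
        total_reward += reward

    return counts, total_reward


if __name__ == "__main__":
    counts, total = ucb1([0.2, 0.8], horizon=2000)
    print("pulls per arm:", counts)
    print("total reward:", total)
```

UCB1 keeps no posterior over the arm parameters; it relies only on empirical means and a confidence bonus, which is what makes it distribution-free. The Bayesian approach described in the abstract instead plans over a tree of beliefs, at a higher computational cost.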

Christos Dimitrakakis

Algorithm | Bayesian Belief Tree | Belief Tree | CIMCA 2008 | Internet Technology |

posted by olethros

» Gaussian Processes for Sample Efficient Reinforcement Learning with RMAXLike Exploration

» An analytic solution to discrete Bayesian reinforcement learning

» A Bayesian Approach to Imitation in Reinforcement Learning

» Exploring Localization in Bayesian Networks for Large Expert Systems

» Coordination in multiagent reinforcement learning a Bayesian approach

» Using Linear Programming for Bayesian Exploration in Markov Decision Processes

» Parameter space exploration with Gaussian process trees

» Generalized model learning for Reinforcement Learning on a humanoid robot

Post Info

Added	29 May 2010
Updated	15 Dec 2011
Type	Conference
Year	2008
Where	CIMCA
Authors	Christos Dimitrakakis

corrected version

	Complexity of Stochastic Branch and Bound Methods for Belief Tree Search in Bayesian Reinforcement Learning 509 views
	Reid et al.'s Distance Bounding Protocol and Mafia Fraud Attacks over Noisy Channels 545 views
	Rollout Sampling Approximate Policy Iteration 334 views
	Bayesian variable order Markov models. 404 views
	Statistical Decision Making for Authentication and Intrusion Detection 634 views