Bayes-Adaptive POMDPs

Bayesian Reinforcement Learning has generated substantial interest recently, as it provides an elegant solution to the exploration-exploitation trade-off in reinforcement learning. However, most investigations of Bayesian reinforcement learning to date focus on standard Markov Decision Processes (MDPs). Our goal is to extend these ideas to the more general Partially Observable MDP (POMDP) framework, where the state is a hidden variable. To address this problem, we introduce a new mathematical model, the Bayes-Adaptive POMDP. This new model allows us to (1) improve knowledge of the POMDP domain through interaction with the environment, and (2) plan optimal sequences of actions that trade off between improving the model, identifying the state, and gathering reward. We show how the model can be finitely approximated while preserving the value function. We describe approximations for belief tracking and planning in this model. Empirical results on two domains show that the model e...
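
To make the idea of tracking the hidden state and the unknown model together more concrete, here is a minimal sketch of one exact belief update. It assumes the unknown transition and observation probabilities are summarized by Dirichlet-style count vectors, which the abstract above does not spell out. The names `belief_update`, `increment`, `phi`, and `psi` are hypothetical and chosen for illustration only; this is not presented as the authors' implementation.

```python
from collections import defaultdict


def increment(counts, index):
    """Return a copy of a flat count tuple with counts[index] increased by 1."""
    return counts[:index] + (counts[index] + 1,) + counts[index + 1:]


def belief_update(belief, action, obs, n_states, n_actions, n_obs):
    """One exact Bayes update over hyperstates (state, phi, psi).

    Assumption: phi is a flat tuple of transition counts indexed by (s, a, s'),
    and psi is a flat tuple of observation counts indexed by (s', a, z).
    The belief is a dict mapping each hyperstate to its probability.
    """
    def t_idx(s, a, s2):
        return (s * n_actions + a) * n_states + s2

    def o_idx(s2, a, z):
        return (s2 * n_actions + a) * n_obs + z

    new_belief = defaultdict(float)
    for (s, phi, psi), p in belief.items():
        t_total = sum(phi[t_idx(s, action, k)] for k in range(n_states))
        for s2 in range(n_states):
            # Expected transition probability under the current counts.
            p_trans = phi[t_idx(s, action, s2)] / t_total
            o_total = sum(psi[o_idx(s2, action, k)] for k in range(n_obs))
            # Expected observation probability under the current counts.
            p_obs = psi[o_idx(s2, action, obs)] / o_total
            w = p * p_trans * p_obs
            if w > 0.0:
                # The hypothesized transition and observation are added to the counts.
                key = (s2,
                       increment(phi, t_idx(s, action, s2)),
                       increment(psi, o_idx(s2, action, obs)))
                new_belief[key] += w
    total = sum(new_belief.values())
    return {k: v / total for k, v in new_belief.items()}


# Toy problem (hypothetical numbers): 2 states, 1 action, 2 observations.
prior = {(0, (1, 1, 1, 1), (2, 1, 1, 2)): 1.0}
posterior = belief_update(prior, action=0, obs=0, n_states=2, n_actions=1, n_obs=2)

# Marginal belief over the hidden state alone.
state_marginal = defaultdict(float)
for (s, _, _), p in posterior.items():
    state_marginal[s] += p
```

In this sketch the number of hyperstates with non-zero probability can grow with every step, which gives some intuition for why approximate belief tracking is needed in practice.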

Stéphane Ross, Brahim Chaib-draa, Joelle Pineau

Bayesian Reinforcement Learning | Information Technology | Model Estimates | NIPS 2007 | Standard Markov Decision

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2007
Where	NIPS
Authors	Stéphane Ross, Brahim Chaib-draa, Joelle Pineau
