Bayesian reinforcement learning in continuous POMDPs with Gaussian processes



Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical model to handle real-world sequential decision processes, but most solution approaches require a known model. Moreover, mainstream POMDP research focuses on the discrete case, which complicates its application to most realistic problems that are naturally modeled using continuous state spaces. In this paper, we consider the problem of optimal control in continuous and partially observable environments when the parameters of the model are unknown. We advocate the use of Gaussian Process Dynamical Models (GPDMs) so that we can learn the model through experience with the environment. Our results on the blimp problem show that the approach can learn good models of the sensors and actuators in order to maximize long-term rewards.
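
To make the idea of learning a model "through experience" concrete, here is a minimal sketch of Gaussian process regression fitted to observed transitions. It is not the authors' GPDM: it assumes a fully observed, one-dimensional state, a squared-exponential kernel with fixed hyperparameters, and a made-up system `true_step`. All function names and parameter values below are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq_dist = (np.sum(A**2, axis=1)[:, None]
               + np.sum(B**2, axis=1)[None, :]
               - 2.0 * A @ B.T)
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)

def gp_fit(X, y, noise_var=1e-2):
    """Factorize the noisy kernel matrix and precompute the weights K^-1 y."""
    K = rbf_kernel(X, X) + noise_var * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return L, alpha

def gp_predict(X_train, L, alpha, X_test):
    """Posterior mean and variance of the latent function at X_test."""
    K_s = rbf_kernel(X_train, X_test)
    mean = K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    var = rbf_kernel(X_test, X_test).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

def true_step(x, a):
    """Hypothetical dynamics, unknown to the learner (assumption for the demo)."""
    return 0.9 * x + 0.5 * np.sin(a)

# Collect experience: random states and actions with noisy next states.
rng = np.random.default_rng(0)
states = rng.uniform(-2.0, 2.0, size=80)
actions = rng.uniform(-2.0, 2.0, size=80)
next_states = true_step(states, actions) + 0.05 * rng.standard_normal(80)

# Learn the transition model (state, action) -> next state.
X = np.column_stack([states, actions])
L, alpha = gp_fit(X, next_states, noise_var=0.05**2)

# Query the learned model at an unseen state-action pair.
query = np.array([[0.5, 1.0]])
mean, var = gp_predict(X, L, alpha, query)
print("predicted next state:", mean[0], "variance:", var[0])
print("true next state:", true_step(0.5, 1.0))
```

The predictive variance is what makes a Gaussian process attractive in a Bayesian setting: it quantifies how uncertain the learned model is in regions with little data. Handling partial observability, as the paper does, would additionally require inferring the latent states rather than observing them directly.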

Patrick Dallaire, Camille Besse, Stéphane Ross, Brahim Chaib-draa


Decision Processes | IROS 2009 | Model | Observable Markov Decision | Robotics |


» Gaussian Processes for Sample Efficient Reinforcement Learning with RMAX-Like Exploration

» Bayes-Adaptive POMDPs

» Reinforcement learning with Gaussian processes

» Bayesian reinforcement learning for POMDP-based dialogue systems

» Gaussian Processes in Reinforcement Learning

» Bayesian update of dialogue state: A POMDP framework for spoken dialogue systems

» Policy Gradient Critics

» Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning

Post Info

Added	24 May 2010
Updated	24 May 2010
Type	Conference
Year	2009
Where	IROS
Authors	Patrick Dallaire, Camille Besse, Stéphane Ross, Brahim Chaib-draa


Sciweavers
