Active Policy Learning for Robot Planning and Exploration under Uncertainty

Download www.roboticsproceedings.org

Abstract— This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested in the domain of robot navigation and exploration under uncertainty, where the expected cost is a function of the belief state (filtering distribution). This filtering distribution is in turn nonlinear and subject to discontinuities, which arise because of constraints in the robot motion and control models. As a result, the expected cost is non-differentiable and very expensive to simulate. The new algorithm overcomes the first difficulty and reduces the number of simulations as follows. First, it assumes that we have carried out previous evaluations of the expected cost for different corresponding policy parameters. Second, it fits a Gaussian process (GP) regression model to these values, so as to approximate the expected cost as a function of the policy parameters. Third, it uses the GP predicted mean and ...
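
The first two steps of the abstract (collect earlier cost evaluations, then fit a GP to them) can be illustrated with a minimal sketch. Everything specific here is an assumption for illustration and not taken from the paper: the squared-exponential kernel, the hyperparameter values, the one-dimensional policy parameter, and the function toy_expected_cost, which is a hypothetical stand-in for the expensive simulated cost. The listing truncates the third step, so the sketch stops at the GP prediction and does not attempt to show how the paper uses it.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.5, signal_var=1.0):
    """Squared-exponential kernel between two sets of parameter vectors (assumed kernel)."""
    sq_dist = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def gp_fit_predict(theta_train, cost_train, theta_query, noise_var=1e-4):
    """Fit a zero-mean GP to (policy parameter, cost) pairs and predict at new parameters."""
    k_train = rbf_kernel(theta_train, theta_train) + noise_var * np.eye(len(theta_train))
    k_cross = rbf_kernel(theta_query, theta_train)
    chol = np.linalg.cholesky(k_train)
    # Solve (K + noise * I) alpha = y via the Cholesky factor.
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, cost_train))
    mean = k_cross @ alpha
    v = np.linalg.solve(chol, k_cross.T)
    var = rbf_kernel(theta_query, theta_query).diagonal() - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)


def toy_expected_cost(theta):
    """Hypothetical stand-in for the expensive simulated expected cost."""
    return np.sin(3.0 * theta[:, 0]) + 0.5 * theta[:, 0] ** 2


# Step 1: previous evaluations of the cost at a few policy parameters.
theta_train = np.linspace(-2.0, 2.0, 8)[:, None]
cost_train = toy_expected_cost(theta_train)

# Step 2: GP approximation of the cost as a function of the policy parameters.
theta_query = np.linspace(-2.0, 2.0, 50)[:, None]
mean, var = gp_fit_predict(theta_train, cost_train, theta_query)
```

The point of the surrogate is that evaluating mean and var at a new parameter costs a few small linear solves, whereas each true cost evaluation would require running the simulator.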

Ruben Martinez-Cantin, Nando de Freitas, Arnaud Do

Corresponding Policy Parameters | Policy Learning Algorithm | Policy Parameters | Robotics | RSS 2007 |

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2007
Where	RSS
Authors	Ruben Martinez-Cantin, Nando de Freitas, Arnaud Doucet, José A. Castellanos
