Sciweavers

RSS
2007

Active Policy Learning for Robot Planning and Exploration under Uncertainty

Abstract— This paper proposes a simulation-based active policy learning algorithm for finite-horizon, partially-observed sequential decision processes. The algorithm is tested in the domain of robot navigation and exploration under uncertainty, where the expected cost is a function of the belief state (filtering distribution). This filtering distribution is in turn nonlinear and subject to discontinuities, which arise because of constraints in the robot motion and control models. As a result, the expected cost is non-differentiable and very expensive to simulate. The new algorithm overcomes the first difficulty and reduces the number of simulations as follows. First, it assumes that we have carried out previous evaluations of the expected cost for different corresponding policy parameters. Second, it fits a Gaussian process (GP) regression model to these values, so as to approximate the expected cost as a function of the policy parameters. Third, it uses the GP predicted mean and ...
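
The first two steps described in the abstract (collecting earlier cost evaluations, then fitting a GP regression model to them) can be illustrated with a minimal sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the zero-mean prior, the squared-exponential kernel and its hyperparameters, the function names, and the toy data. The predictive variance is returned only because it is a standard GP output. How the paper uses the predictions is truncated in the abstract and is not reproduced here.

```python
import math


def rbf_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential kernel (an assumed choice) between two parameter vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return signal_var * math.exp(-0.5 * sq_dist / length_scale ** 2)


def cholesky(A):
    """Lower-triangular Cholesky factor L of a symmetric positive-definite A (A = L L^T)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L


def solve_lower(L, b):
    """Forward substitution: solve L y = b."""
    y = [0.0] * len(b)
    for i in range(len(b)):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    return y


def solve_lower_transposed(L, y):
    """Back substitution: solve L^T x = y."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x


def gp_predict(thetas, costs, theta_new, noise_var=1e-6):
    """Zero-mean GP regression: predictive mean and variance of the cost at theta_new.

    thetas: previously evaluated policy parameter vectors (hypothetical data).
    costs:  the expected-cost estimates obtained for those parameters.
    """
    n = len(thetas)
    K = [[rbf_kernel(thetas[i], thetas[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    alpha = solve_lower_transposed(L, solve_lower(L, costs))  # (K + noise I)^-1 y
    k_star = [rbf_kernel(t, theta_new) for t in thetas]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = rbf_kernel(theta_new, theta_new) - sum(x * x for x in v)
    return mean, max(var, 0.0)


# Toy data: three earlier evaluations of a one-dimensional policy parameter.
past_thetas = [(0.0,), (1.0,), (2.0,)]
past_costs = [3.0, 1.0, 2.0]
predicted_mean, predicted_var = gp_predict(past_thetas, past_costs, (1.5,))
```

Because each simulation of the expected cost is described as very expensive, a surrogate of this kind lets candidate policy parameters be scored without running a new simulation for each one.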
Added 30 Oct 2010
Updated 30 Oct 2010
Type Conference
Year 2007
Where RSS
Authors Ruben Martinez-Cantin, Nando de Freitas, Arnaud Doucet, José A. Castellanos