Sciweavers

91 search results - page 9 / 19
» Parameter-exploring policy gradients
JMLR
2006
13 years 7 months ago
Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation
We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...
Rémi Munos
ICRA
2010
IEEE
13 years 6 months ago
Reinforcement learning of motor skills in high dimensions: A path integral approach
Reinforcement learning (RL) is one of the most general approaches to learning control. Its applicability to complex motor systems, however, has been largely impossible so far d...
Evangelos Theodorou, Jonas Buchli, Stefan Schaal
EWRL
2008
13 years 9 months ago
Policy Learning - A Unified Perspective with Applications in Robotics
Policy Learning approaches are among the best suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper,...
Jan Peters, Jens Kober, Duy Nguyen-Tuong
NIPS
2003
13 years 9 months ago
Bounded Finite State Controllers
We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic fini...
Pascal Poupart, Craig Boutilier
ICRA
2010
IEEE
13 years 6 months ago
A simple learning strategy for high-speed quadrocopter multi-flips
We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first...
Sergei Lupashin, Angela Schöllig, Michael She...