Search Sciweavers | Sciweavers

91 search results - page 9 / 19

» Parameter-exploring policy gradients

JMLR
2006

Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation

15 years 5 months ago

Download www.aaai.org

We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...

Rémi Munos

ICRA
2010
IEEE

Reinforcement learning of motor skills in high dimensions: A path integral approach

15 years 4 months ago

Download www-personal.acfr.usyd.edu.au

Reinforcement learning (RL) is one of the most general approaches to learning control. Its applicability to complex motor systems, however, has been largely impossible so far d...

Evangelos Theodorou, Jonas Buchli, Stefan Schaal

EWRL
2008

Policy Learning - A Unified Perspective with Applications in Robotics

15 years 7 months ago

Download www.kyb.tuebingen.mpg.de

Policy Learning approaches are among the best suited methods for high-dimensional, continuous control systems such as anthropomorphic robot arms and humanoid robots. In this paper,...

Jan Peters, Jens Kober, Duy Nguyen-Tuong

NIPS
2003

Bounded Finite State Controllers

15 years 7 months ago

Download books.nips.cc

We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic fini...

Pascal Poupart, Craig Boutilier

ICRA
2010
IEEE

A simple learning strategy for high-speed quadrocopter multi-flips

15 years 4 months ago

Download www.idsc.ethz.ch

We describe a simple and intuitive policy gradient method for improving parametrized quadrocopter multi-flips by combining iterative experiments with information from a first...

Sergei Lupashin, Angela Schöllig, Michael She...
