Sciweavers

163 search results - page 10 / 33
» Policy Gradient Methods for Robotics
ICANNGA
2007
Springer
105 views · Algorithms
14 years 2 months ago
Reinforcement Learning in Fine Time Discretization
Reinforcement Learning (RL) is analyzed here as a tool for control system optimization. State and action spaces are assumed to be continuous. Time is assumed to be discrete, yet th...
Pawel Wawrzynski
IROS
2006
IEEE
187 views · Robotics
14 years 1 month ago
Fast and Stable Learning of Quasi-Passive Dynamic Walking by an Unstable Biped Robot based on Off-Policy Natural Actor-Critic
Recently, many researchers on humanoid robotics are interested in Quasi-Passive-Dynamic Walking (Quasi-PDW) which is similar to human walking. It is desirable that control para...
Tsuyoshi Ueno, Yutaka Nakamura, Takashi Takuma, To...
ICRA
2009
IEEE
143 views · Robotics
14 years 2 months ago
Least absolute policy iteration for robust value function approximation
Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efficiency. However, it tends to be sensitive to outliers...
Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...
ICRA
2010
IEEE
145 views · Robotics
13 years 6 months ago
Reinforcement learning of motor skills in high dimensions: A path integral approach
Reinforcement learning (RL) is one of the most general approaches to learning control. Its applicability to complex motor systems, however, has been largely impossible so far d...
Evangelos Theodorou, Jonas Buchli, Stefan Schaal
NIPS
1990
13 years 9 months ago
Planning with an Adaptive World Model
We present a new connectionist planning method [TML90]. By interaction with an unknown environment, a world model is progressively constructed using gradient descent. For deriving ...
Sebastian Thrun, Knut Möller, Alexander Linde...