Search Sciweavers | Sciweavers

173

RAS
2006

105views more RAS 2006»

Reinforcement learning for quasi-passive dynamic walking of an unstable biped robot

15 years 6 months ago

A class of biped locomotion called Passive Dynamic Walking (PDW) has been recognized to be efficient in energy consumption and a key to understand human walking. Although PDW is s...

Kentarou Hitomi, Tomohiro Shibata, Yutaka Nakamura...

claim paper

Read More »

207

click to vote

Publication

352views

Efficient methods for near-optimal sequential decision making under uncertainty

16 years 2 months ago

Download fias.uni-frankfurt.de

This chapter discusses decision making under uncertainty. More specifically, it offers an overview of efficient Bayesian and distribution-free algorithms for making near-optimal se...

Christos Dimitrakakis

posted by olethros

Read More »

167

click to vote

WSC
2008

154views Modeling And Simulation» more WSC 2008»

On step sizes, stochastic shortest paths, and survival probabilities in Reinforcement Learning

15 years 8 months ago

Download www.informs-sim.org

Reinforcement Learning (RL) is a simulation-based technique useful in solving Markov decision processes if their transition probabilities are not easily obtainable or if the probl...

Abhijit Gosavi

claim paper

Read More »

171

click to vote

NECO
2010

97views more NECO 2010»

Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning

15 years 4 months ago

Download www.kyb.tuebingen.mpg.de

Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...

Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...

claim paper

Read More »

168

click to vote

ICRA
2009
IEEE

143views Robotics» more ICRA 2009»

Least absolute policy iteration for robust value function approximation

16 years 1 months ago

Download sugiyama-www.cs.titech.ac.jp

Abstract— Least-squares policy iteration is a useful reinforcement learning method in robotics due to its computational efﬁciency. However, it tends to be sensitive to outliers...

Masashi Sugiyama, Hirotaka Hachiya, Hisashi Kashim...

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers