Search Sciweavers | Sciweavers

181 search results - page 6 / 37

» On Policy Learning in Restricted Policy Spaces

IROS
2008
IEEE

Robotics

Learning nonparametric policies by imitation

14 years 2 months ago

A long-cherished goal in artificial intelligence has been the ability to endow a robot with the capacity to learn and generalize skills from watching a human teacher. Such an ...

David B. Grimes, Rajesh P. N. Rao

ALT
2011
Springer

Machine Learning

Deviations of Stochastic Bandit Regret

12 years 7 months ago

This paper studies the deviations of the regret in a stochastic multi-armed bandit problem. When the total number of plays n is known beforehand by the agent, Audibert et al. (2009...

Antoine Salomon, Jean-Yves Audibert

ICML
2003
IEEE

Machine Learning

Hierarchical Policy Gradient Algorithms

14 years 8 months ago

Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...

Mohammad Ghavamzadeh, Sridhar Mahadevan

ICRA
2005
IEEE

Robotics

Learning Sensory Feedback to CPG with Policy Gradient for Biped Locomotion

14 years 1 months ago

This paper proposes a learning framework for a CPG-based biped locomotion controller using a policy gradient method. Our goal in this study is to develop an efficient learning...

Takamitsu Matsubara, Jun Morimoto, Jun Nakanishi, ...

NIPS
2008

Information Technology

Regularized Policy Iteration

13 years 9 months ago

In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...

Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...
