Sciweavers

163 search results - page 19 / 33
» Policy Gradient Methods for Robotics
GECCO
2011
Springer
Evolving complete robots with CPPN-NEAT: the utility of recurrent connections
This paper extends prior work using Compositional Pattern Producing Networks (CPPNs) as a generative encoding for the purpose of simultaneously evolving robot morphology and contr...
Joshua E. Auerbach, Josh C. Bongard
PKDD
2009
Springer
Active Learning for Reward Estimation in Inverse Reinforcement Learning
Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...
Manuel Lopes, Francisco S. Melo, Luis Montesano
ICML
1995
IEEE
Learning Policies for Partially Observable Environments: Scaling Up
Partially observable Markov decision processes (POMDPs) model decision problems in which an agent tries to maximize its reward in the face of limited and/or noisy sensor fee...
Michael L. Littman, Anthony R. Cassandra, Leslie P...
ICRA
2009
IEEE
Stochastic strategies for a swarm robotic assembly system
We present a decentralized, scalable approach to assembling a group of heterogeneous parts into different products using a swarm of robots. While the assembly plans are predete...
Loic Matthey, Spring Berman, Vijay Kumar
ICRA
2007
IEEE
Dogged Learning for Robots
Ubiquitous robots need the ability to adapt their behaviour to the changing situations and demands they will encounter during their lifetimes. In particular, non-technical user...
Daniel H. Grollman, Odest Chadwicke Jenkins