Sciweavers

» Optimal policy switching algorithms for reinforcement learni...
NIPS
2008
13 years 9 months ago
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
John W. Roberts, Russ Tedrake
JMLR
2010
13 years 2 months ago
A Generalized Path Integral Control Approach to Reinforcement Learning
With the goal to generate more scalable algorithms with higher efficiency and fewer open parameters, reinforcement learning (RL) has recently moved towards combining classical tec...
Evangelos Theodorou, Jonas Buchli, Stefan Schaal
UAI
2008
13 years 9 months ago
CORL: A Continuous-state Offset-dynamics Reinforcement Learner
Continuous state spaces and stochastic, switching dynamics characterize a number of rich, real-world domains, such as robot navigation across varying terrain. We describe a reinfor...
Emma Brunskill, Bethany R. Leffler, Lihong Li, Mic...
ML
2000
ACM
13 years 7 months ago
Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms
Satinder P. Singh, Tommi Jaakkola, Michael L. Litt...
ICML
1994
IEEE
13 years 11 months ago
Markov Games as a Framework for Multi-Agent Reinforcement Learning
In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function....
Michael L. Littman