Sciweavers

91 search results - page 10 / 19
» Parameter-exploring policy gradients
ICML
2000
IEEE
14 years 8 months ago
Reinforcement Learning in POMDP's via Direct Gradient Ascent
This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...
Jonathan Baxter, Peter L. Bartlett
IOR
2011
13 years 2 months ago
Information Collection on a Graph
We derive a knowledge gradient policy for an optimal learning problem on a graph, in which we use sequential measurements to refine Bayesian estimates of individual edge values i...
Ilya O. Ryzhov, Warren B. Powell
NIPS
2007
13 years 9 months ago
Incremental Natural Actor-Critic Algorithms
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
GECCO
2009
Springer
13 years 5 months ago
Uncertainty handling CMA-ES for reinforcement learning
The covariance matrix adaptation evolution strategy (CMA-ES) has proven to be a powerful method for reinforcement learning (RL). Recently, the CMA-ES has been augmented with an ada...
Verena Heidrich-Meisner, Christian Igel
INFOCOM
1995
IEEE
13 years 11 months ago
Complexity of Gradient Projection Method for Optimal Routing in Data Networks
In this paper, we derive a time-complexity bound for the gradient projection method for optimal routing in data networks. This result shows that the gradient projection algorith...
Wei Kang Tsai, John K. Antonio, Garng M. Huang