Sciweavers

91 search results - page 4 / 19
» Parameter-exploring policy gradients
CEC
2011
IEEE
14 years 5 months ago
Stochastic Natural Gradient Descent by estimation of empirical covariances
Stochastic relaxation aims at finding the minimum of a fitness function by identifying a proper sequence of distributions, in a given model, that minimize the expected value o...
Luigi Malagò, Matteo Matteucci, Giovanni Pi...
NIPS
2008
15 years 7 months ago
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
John W. Roberts, Russ Tedrake
IJCAI
2003
15 years 7 months ago
Covariant Policy Search
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...
J. Andrew Bagnell, Jeff G. Schneider
AIPS
2007
15 years 7 months ago
Concurrent Probabilistic Temporal Planning with Policy-Gradients
We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search t...
Douglas Aberdeen, Olivier Buffet
AAAI
2010
15 years 7 months ago
Relative Entropy Policy Search
Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...
Jan Peters, Katharina Mülling, Yasemin Altun