Search Sciweavers | Sciweavers

2 search results - page 1 / 1

» Derivatives of Logarithmic Stationary Distributions for Poli...

185

click to vote

NECO
2010

97views more NECO 2010»

Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning

15 years 5 months ago

Download www.kyb.tuebingen.mpg.de

Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...

Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...

claim paper

Read More »

187

click to vote

ICML
2010
IEEE

167views Machine Learning» more ICML 2010»

Finite-Sample Analysis of LSTD

15 years 8 months ago

Download hal.inria.fr

In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...

Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...

claim paper

Read More »

« Prev « First page 1 / 1 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers