Search Sciweavers | Sciweavers

115 search results - page 4 / 23

» Recurrent policy gradients

219

click to vote

NIPS
2008

110views Information Technology» more NIPS 2008»

Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms

15 years 8 months ago

Download groups.csail.mit.edu

Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...

John W. Roberts, Russ Tedrake

claim paper

Read More »

194

click to vote

IJCAI
2003

169views Artificial Intelligence» more IJCAI 2003»

Covariant Policy Search

15 years 8 months ago

Download www.ri.cmu.edu

We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...

J. Andrew Bagnell, Jeff G. Schneider

claim paper

Read More »

167

click to vote

AIPS
2007

119views Artificial Intelligence» more AIPS 2007»

Concurrent Probabilistic Temporal Planning with Policy-Gradients

15 years 9 months ago

Download eprints.pascal-network.org

We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search t...

Douglas Aberdeen, Olivier Buffet

claim paper

Read More »

223

click to vote

AAAI
2011

145views Intelligent Agents» more AAAI 2011»

Policy Gradient Planning for Environmental Decision Making with Existing Simulators

14 years 7 months ago

Download www.cs.ubc.ca

In environmental and natural resource planning domains actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...

Mark Crowley, David Poole

claim paper

Read More »

200

click to vote

AAAI
2010

191views Intelligent Agents» more AAAI 2010»

Relative Entropy Policy Search

15 years 9 months ago

Download www.kyb.tuebingen.mpg.de

Policy search is a successful approach to reinforcement learning. However, policy improvements often result in the loss of information. Hence, it has been marred by premature conv...

Jan Peters, Katharina Mülling, Yasemin Altun

claim paper

Read More »

« Prev « First page 4 / 23 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers