Sciweavers

» Policy Gradient Critics
NIPS
2008
13 years 9 months ago
Signal-to-Noise Ratio Analysis of Policy Gradient Algorithms
Policy gradient (PG) reinforcement learning algorithms have strong (local) convergence guarantees, but their learning performance is typically limited by a large variance in the e...
John W. Roberts, Russ Tedrake
ANOR
2004
13 years 7 months ago
Model-Based Search for Combinatorial Optimization: A Critical Survey
In this paper we introduce model-based search as a unifying framework accommodating some recently proposed metaheuristics for combinatorial optimization such as ant colony optimiza...
Mark Zlochin, Mauro Birattari, Nicolas Meuleau, Ma...
IJCAI
2003
13 years 9 months ago
Covariant Policy Search
We investigate the problem of non-covariant behavior of policy gradient reinforcement learning algorithms. The policy gradient approach is amenable to analysis by information geom...
J. Andrew Bagnell, Jeff G. Schneider
AIPS
2007
13 years 10 months ago
Concurrent Probabilistic Temporal Planning with Policy-Gradients
We present an any-time concurrent probabilistic temporal planner that includes continuous and discrete uncertainties and metric functions. Our approach is a direct policy search t...
Douglas Aberdeen, Olivier Buffet
AAAI
2011
12 years 7 months ago
Policy Gradient Planning for Environmental Decision Making with Existing Simulators
In environmental and natural resource planning domains, actions are taken at a large number of locations over multiple time periods. These problems have enormous state and action s...
Mark Crowley, David Poole