Search Sciweavers | Sciweavers

21 search results - page 4 / 5

» Rates of Convergence of Performance Gradient Estimates Using...


JMLR
2006

124 views

Policy Gradient in Continuous Time

15 years 5 months ago

Download hal.inria.fr

Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...

Rémi Munos


NIPS
2008

165 views | Information Technology

Regularized Policy Iteration

15 years 7 months ago

Download webdocs.cs.ualberta.ca

In this paper we consider approximate policy-iteration-based reinforcement learning algorithms. In order to implement a flexible function approximation scheme we propose the use o...

Amir Massoud Farahmand, Mohammad Ghavamzadeh, Csab...


GECCO
2006
Springer

195 views | Optimization

Studying XCS/BOA learning in Boolean functions: structure encoding and random Boolean functions

15 years 9 months ago

Download www.coboslab.psychologie.uni-wuerzburg.de

Recently, studies with the XCS classifier system on Boolean functions have shown that in certain types of functions simple crossover operators can lead to disruption and, conseque...

Martin V. Butz, Martin Pelikan


GECCO
2006
Springer

177 views | Optimization

Hyper-ellipsoidal conditions in XCS: rotation, linear approximation, and solution structure

15 years 9 months ago

Download www.eskimo.com

The learning classifier system XCS is an iterative rule-learning system that evolves rule structures based on gradient-based prediction and rule quality estimates. Besides classifi...

Martin V. Butz, Pier Luca Lanzi, Stewart W. Wilson


UAI
2008

242 views | Artificial Intelligence

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

15 years 7 months ago

Download uai2008.cs.helsinki.fi

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available afte...

Richard S. Sutton, Csaba Szepesvári, Alborz...
