Search Sciweavers | Sciweavers

133 search results - page 6 / 27

» Hierarchical Policy Gradient Algorithms


SODA
1997
ACM

98views Algorithms» more SODA 1997»

Optimal Good-Aspect-Ratio Coarsening for Unstructured Meshes

13 years 11 months ago

Download www.cs.berkeley.edu

A hierarchical gradient of an unstructured mesh M0 is a sequence of meshes M1, ..., Mk such that |Mk| is smaller than a given threshold mesh size b. The gradient is well-conditioned...

Gary L. Miller, Dafna Talmor, Shang-Hua Teng


NIPS
1998

140views Information Technology» more NIPS 1998»

Gradient Descent for General Reinforcement Learning

13 years 11 months ago

Download www.ri.cmu.edu

A simple learning rule is derived, the VAPS algorithm, which can be instantiated to generate a wide range of new reinforcement-learning algorithms. These algorithms solve a number ...

Leemon C. Baird III, Andrew W. Moore


ICML
2007
IEEE

180views Machine Learning» more ICML 2007»

Bayesian actor-critic algorithms

14 years 10 months ago

Download www.machinelearning.org

We present a new actor-critic learning model in which a Bayesian class of non-parametric critics, using Gaussian process temporal difference learning, is used. Such critics model ...

Mohammad Ghavamzadeh, Yaakov Engel


NIPS
2003

180views Information Technology» more NIPS 2003»

Bounded Finite State Controllers

13 years 11 months ago

Download books.nips.cc

We describe a new approximation algorithm for solving partially observable MDPs. Our bounded policy iteration approach searches through the space of bounded-size, stochastic fini...

Pascal Poupart, Craig Boutilier


ICML
2000
IEEE

126views Machine Learning» more ICML 2000»

Reinforcement Learning in POMDP's via Direct Gradient Ascent

14 years 10 months ago

Download reference.kfupm.edu.sa

This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...

Jonathan Baxter, Peter L. Bartlett

