Search Sciweavers | Sciweavers

13 search results - page 3 / 3

» Rollout Sampling Approximate Policy Iteration

JMLR
2006

Geometric Variance Reduction in Markov Chains: Application to Value Function and Gradient Estimation

Download www.aaai.org

We study a sequential variance reduction technique for Monte Carlo estimation of functionals in Markov Chains. The method is based on designing sequential control variates using s...

Rémi Munos

CORR
2010
Springer

Global Optimization for Value Function Approximation

Download www.cs.umass.edu

Existing value function approximation methods have been successfully used in many applications, but they often lack useful a priori error bounds. We propose a new approximate bili...

Marek Petrik, Shlomo Zilberstein

CORR
2006
Springer

A Unified View of TD Algorithms; Introducing Full-Gradient TD and Equi-Gradient Descent TD

Download hal.inria.fr

This paper addresses the issue of policy evaluation in Markov Decision Processes, using linear function approximation. It provides a unified view of algorithms such as TD(λ), LSTD(λ)...

Manuel Loth, Philippe Preux
