Search Sciweavers

102 search results - page 4 / 21

» Efficient Asymptotic Approximation in Temporal Difference Le...

NIPS
2008

Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation

15 years 8 months ago

Download eprints.pascal-network.org

Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g.,...

Dotan Di Castro, Dmitry Volkinshtein, Ron Meir

ICML
2010
IEEE

Convergence of Least Squares Temporal Difference Methods Under General Conditions

15 years 7 months ago

Download www.cs.helsinki.fi

We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...

Huizhen Yu

ICML
2003
IEEE

Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning

16 years 7 months ago

Download webee.technion.ac.il

We present a novel Bayesian approach to the problem of value function estimation in continuous state spaces. We define a probabilistic generative model for the value function by i...

Yaakov Engel, Shie Mannor, Ron Meir

CORR
2010
Springer

Asymptotic Learning Curve and Renormalizable Condition in Statistical Learning Theory

15 years 6 months ago

Download dex-smi.sp.dis.titech.ac.jp

Bayes statistics and statistical physics have the common mathematical structure, where the log likelihood function corresponds to the random Hamiltonian. Recently, it was discovere...

Sumio Watanabe

ICML
2010
IEEE

Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda

15 years 4 months ago

Download www.icml2010.org

Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...

Carlton Downey, Scott Sanner
