Sciweavers

NIPS
2008
13 years 8 months ago
Temporal Difference Based Actor Critic Learning - Convergence and Neural Implementation
Actor-critic algorithms for reinforcement learning are achieving renewed popularity due to their good convergence properties in situations where other approaches often fail (e.g.,...
Dotan Di Castro, Dmitry Volkinshtein, Ron Meir
ICML
2010
IEEE
13 years 8 months ago
Convergence of Least Squares Temporal Difference Methods Under General Conditions
We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...
Huizhen Yu
ICML
2003
IEEE
14 years 8 months ago
Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning
We present a novel Bayesian approach to the problem of value function estimation in continuous state spaces. We define a probabilistic generative model for the value function by i...
Yaakov Engel, Shie Mannor, Ron Meir
CORR
2010
Springer
13 years 7 months ago
Asymptotic Learning Curve and Renormalizable Condition in Statistical Learning Theory
Bayesian statistics and statistical physics share a common mathematical structure, in which the log likelihood function corresponds to the random Hamiltonian. Recently, it was discovere...
Sumio Watanabe
ICML
2010
IEEE
13 years 5 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner