Search Sciweavers | Sciweavers

1118 search results - page 7 / 224

» Relational temporal difference learning

134

click to vote

ML
2002
ACM

168views Machine Learning» more ML 2002»

On Average Versus Discounted Reward Temporal-Difference Learning

15 years 2 months ago

Download web.mit.edu

We provide an analytical comparison between discounted and average reward temporal-difference (TD) learning with linearly parameterized approximations. We first consider the asympt...

John N. Tsitsiklis, Benjamin Van Roy

claim paper

Read More »

110

Voted

ICML
2003
IEEE

168views Machine Learning» more ICML 2003»

Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning

16 years 3 months ago

Download webee.technion.ac.il

We present a novel Bayesian approach to the problem of value function estimation in continuous state spaces. We define a probabilistic generative model for the value function by i...

Yaakov Engel, Shie Mannor, Ron Meir

claim paper

Read More »

104

click to vote

ML
2000
ACM

126views Machine Learning» more ML 2000»

Learning to Play Chess Using Temporal Differences

15 years 2 months ago

Download www.cs.princeton.edu

In this paper we present TDLEAF( ), a variation on the TD( ) algorithm that enables it to be used in conjunction with game-tree search. We present some experiments in which our che...

Jonathan Baxter, Andrew Tridgell, Lex Weaver

claim paper

Read More »

148

Voted

ICML
1999
IEEE

168views Machine Learning» more ICML 1999»

Least-Squares Temporal Difference Learning

16 years 3 months ago

Download www.research.rutgers.edu

Excerpted from: Boyan, Justin. Learning Evaluation Functions for Global Optimization. Ph.D. thesis, Carnegie Mellon University, August 1998. (Available as Technical Report CMU-CS-...

Justin A. Boyan

claim paper

Read More »

133

click to vote

ISDA
2009
IEEE

144views Operating System» more ISDA 2009»

Postponed Updates for Temporal-Difference Reinforcement Learning

15 years 9 months ago

Download www.science.uva.nl

This paper presents postponed updates, a new strategy for TD methods that can improve sample efﬁciency without incurring the computational and space requirements of model-based ...

Harm van Seijen, Shimon Whiteson

claim paper

Read More »

« Prev « First page 7 / 224 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers