Sciweavers

10 search results - page 2 / 2
» Coevolutionary Temporal Difference Learning for small-board ...
Sort
View
NIPS
1996
13 years 10 months ago
Why did TD-Gammon Work?
Although TD-Gammon is one of the major successes in machine learning, it has not led to similar impressive breakthroughs in temporal difference learning for other applications or ...
Jordan B. Pollack, Alan D. Blair
IJCAI
2007
13 years 10 months ago
Reinforcement Learning of Local Shape in the Game of Go
We explore an application to the game of Go of a reinforcement learning approach based on a linear evaluation function and large numbers of binary features. This strategy has prov...
David Silver, Richard S. Sutton, Martin Mülle...
ACG
2003
Springer
14 years 1 months ago
Evaluation in Go by a Neural Network using Soft Segmentation
In this article a neural network architecture is presented that is able to build a soft segmentation of a two-dimensional input. This network architecture is applied to position ev...
Markus Enzenberger
JMLR
2002
100views more  JMLR 2002»
13 years 8 months ago
On the Convergence of Optimistic Policy Iteration
We consider a finite-state Markov decision problem and establish the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values,...
John N. Tsitsiklis
ESANN
2008
13 years 10 months ago
Learning to play Tetris applying reinforcement learning methods
In this paper the application of reinforcement learning to Tetris is investigated, particulary the idea of temporal difference learning is applied to estimate the state value funct...
Alexander Groß, Jan Friedland, Friedhelm Sch...