Sciweavers

61 search results - page 10 / 13
» Convergence of synchronous reinforcement learning with linea...
GECCO
2009
Springer
14 years 2 months ago
Apply ant colony optimization to Tetris
Tetris is a falling block game where the player’s objective is to arrange a sequence of different shaped tetrominoes smoothly in order to survive. In the intelligence games, ag...
Xingguo Chen, Hao Wang, Weiwei Wang, Yinghuan Shi,...
CORR
2010
Springer
13 years 7 months ago
Dynamic Policy Programming
In this paper, we consider the problem of planning and learning in the infinite-horizon discounted-reward Markov decision problems. We propose a novel iterative direct policy-searc...
Mohammad Gheshlaghi Azar, Hilbert J. Kappen
CORR
2010
Springer
13 years 6 months ago
Predictive State Temporal Difference Learning
We propose a new approach to value function approximation which combines linear temporal difference reinforcement learning with subspace identification. In practical applications...
Byron Boots, Geoffrey J. Gordon
NIPS
1998
13 years 8 months ago
Finite-Sample Convergence Rates for Q-Learning and Indirect Algorithms
In this paper, we address two issues of long-standing interest in the reinforcement learning literature. First, what kinds of performance guarantees can be made for Q-learning aft...
Michael J. Kearns, Satinder P. Singh
IJCNN
2000
IEEE
13 years 12 months ago
Piecewise Linear Homeomorphisms: The Scalar Case
The class of piecewise linear homeomorphisms (PLH) provides a convenient functional representation for many applications wherein an approximation to data is required that is inver...
Richard E. Groff, Daniel E. Koditschek, Pramod P. ...