Sciweavers

17 search results - page 3 / 4
» Fast gradient-descent methods for temporal-difference learni...
Sort
View
ICML
2006
IEEE
14 years 28 days ago
Automatic basis function construction for approximate dynamic programming and reinforcement learning
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
Philipp W. Keller, Shie Mannor, Doina Precup
ESANN
2006
13 years 8 months ago
Magnification control for batch neural gas
Neural gas (NG) constitutes a very robust clustering algorithm which can be derived as stochastic gradient descent from a cost function closely connected to the quantization error...
Barbara Hammer, Alexander Hasenfuss, Thomas Villma...
JMLR
2006
153views more  JMLR 2006»
13 years 7 months ago
Collaborative Multiagent Reinforcement Learning by Payoff Propagation
In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of c...
Jelle R. Kok, Nikos A. Vlassis
ATAL
2008
Springer
13 years 9 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
JAIR
2002
163views more  JAIR 2002»
13 years 6 months ago
Efficient Reinforcement Learning Using Recursive Least-Squares Methods
The recursive least-squares (RLS) algorithm is one of the most well-known algorithms used in adaptive filtering, system identification and adaptive control. Its popularity is main...
Xin Xu, Hangen He, Dewen Hu