Sciweavers

17 search results - page 3 / 4
» Fast gradient-descent methods for temporal-difference learni...
ICML
2006
IEEE
15 years 8 months ago
Automatic basis function construction for approximate dynamic programming and reinforcement learning
We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...
Philipp W. Keller, Shie Mannor, Doina Precup
ESANN
2006
15 years 3 months ago
Magnification control for batch neural gas
Neural gas (NG) constitutes a very robust clustering algorithm which can be derived as stochastic gradient descent from a cost function closely connected to the quantization error...
Barbara Hammer, Alexander Hasenfuss, Thomas Villma...
JMLR
2006
15 years 2 months ago
Collaborative Multiagent Reinforcement Learning by Payoff Propagation
In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of c...
Jelle R. Kok, Nikos A. Vlassis
ATAL
2008
Springer
15 years 4 months ago
Sigma point policy iteration
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
JAIR
2002
15 years 2 months ago
Efficient Reinforcement Learning Using Recursive Least-Squares Methods
The recursive least-squares (RLS) algorithm is one of the most well-known algorithms used in adaptive filtering, system identification and adaptive control. Its popularity is main...
Xin Xu, Hangen He, Dewen Hu