Search Sciweavers | Sciweavers

17 search results - page 3 / 4

» Fast gradient-descent methods for temporal-difference learni...

150

click to vote

ICML
2006
IEEE

256views Machine Learning» more ICML 2006»

Automatic basis function construction for approximate dynamic programming and reinforcement learning

15 years 8 months ago

Download www.ece.mcgill.ca

We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results ...

Philipp W. Keller, Shie Mannor, Doina Precup

claim paper

Read More »

131

click to vote

ESANN
2006

140views Neural Networks» more ESANN 2006»

Magnification control for batch neural gas

15 years 3 months ago

Download www2.in.tu-clausthal.de

Neural gas (NG) constitutes a very robust clustering algorithm which can be derived as stochastic gradient descent from a cost function closely connected to the quantization error...

Barbara Hammer, Alexander Hasenfuss, Thomas Villma...

claim paper

Read More »

115

click to vote

JMLR
2006

153views more JMLR 2006»

Collaborative Multiagent Reinforcement Learning by Payoff Propagation

15 years 2 months ago

Download jmlr.csail.mit.edu

In this article we describe a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting. As a basis we use the framework of c...

Jelle R. Kok, Nikos A. Vlassis

claim paper

Read More »

124

Voted

ATAL
2008
Springer

123views Intelligent Agents» more ATAL 2008»

Sigma point policy iteration

15 years 4 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

claim paper

Read More »

143

click to vote

JAIR
2002

163views more JAIR 2002»

Efficient Reinforcement Learning Using Recursive Least-Squares Methods

15 years 2 months ago

Download www.jair.org

The recursive least-squares (RLS) algorithm is one of the most well-known algorithms used in adaptive filtering, system identification and adaptive control. Its popularity is main...

Xin Xu, Hangen He, Dewen Hu

claim paper

Read More »

« Prev « First page 3 / 4 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Document Tools

Image Tools

Sciweavers