Sciweavers


ICML
2001
IEEE


Convergence of Gradient Dynamics with a Variable Learning Rate

Download www.cs.cmu.edu

As multiagent environments become more prevalent, we need to understand how this changes the agent-based paradigm. One aspect that is heavily affected by the presence of multiple agents is learning. Traditional learning algorithms rest on core assumptions, such as Markovian transitions, which are violated in these environments. Yet understanding the behavior of learning algorithms in these domains is critical. Singh, Kearns, and Mansour (2000) examine gradient ascent learning, specifically within a restricted class of repeated matrix games. They prove that, when this technique is used, the average of expected payoffs over time converges. On the other hand, they also show that neither the players' strategies nor their expected payoffs themselves are guaranteed to converge. In this paper we introduce a variable learning rate for gradient ascent, along with the WoLF ("Win or Learn Fast") principle for regulating the learning rate. We then prove that this modification to gradient ...
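The idea in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: it assumes a two-player, two-action matrix game (matching pennies, chosen here only for illustration), treats a player as "winning" when its expected payoff exceeds what its equilibrium strategy would earn against the opponent's current strategy, and uses a small step size when winning and a larger one when losing. The payoff matrices, the step sizes `rate_win` and `rate_lose`, and the helper names `expected_payoff`, `clip01`, and `run` are all hypothetical.

```python
import math

# Matching pennies payoffs (an illustrative choice, not taken from the abstract).
R = [[1.0, -1.0], [-1.0, 1.0]]   # row player's payoffs
C = [[-1.0, 1.0], [1.0, -1.0]]   # column player's payoffs
EQUILIBRIUM = (0.5, 0.5)         # mixed equilibrium of matching pennies


def expected_payoff(M, alpha, beta):
    """Expected payoff under M when row plays action 0 with probability
    alpha and column plays action 0 with probability beta."""
    return (alpha * beta * M[0][0] + alpha * (1 - beta) * M[0][1]
            + (1 - alpha) * beta * M[1][0]
            + (1 - alpha) * (1 - beta) * M[1][1])


def clip01(p):
    """Keep a probability inside [0, 1]."""
    return min(1.0, max(0.0, p))


def run(alpha, beta, steps, rate_win, rate_lose):
    """Simultaneous gradient ascent by both players.

    Each player steps with rate_win when "winning" and rate_lose otherwise.
    Passing rate_win == rate_lose gives fixed-rate gradient ascent.
    """
    a_eq, b_eq = EQUILIBRIUM
    u_row = R[0][0] - R[0][1] - R[1][0] + R[1][1]
    u_col = C[0][0] - C[0][1] - C[1][0] + C[1][1]
    for _ in range(steps):
        # Partial derivatives of each player's expected payoff
        # with respect to its own strategy.
        grad_a = beta * u_row + (R[0][1] - R[1][1])
        grad_b = alpha * u_col + (C[1][0] - C[1][1])
        # Assumed "winning" test: doing better than the equilibrium
        # strategy would against the opponent's current strategy.
        row_wins = (expected_payoff(R, alpha, beta)
                    > expected_payoff(R, a_eq, beta))
        col_wins = (expected_payoff(C, alpha, beta)
                    > expected_payoff(C, alpha, b_eq))
        rate_a = rate_win if row_wins else rate_lose
        rate_b = rate_win if col_wins else rate_lose
        # Both players update from the same (old) strategy pair.
        alpha, beta = (clip01(alpha + rate_a * grad_a),
                       clip01(beta + rate_b * grad_b))
    return alpha, beta


if __name__ == "__main__":
    print("variable rate:", run(0.9, 0.2, 20000, 0.001, 0.004))
    print("fixed rate:   ", run(0.9, 0.2, 20000, 0.004, 0.004))
```

Under these assumptions, the variable-rate run should settle near the mixed equilibrium, while the fixed-rate run keeps circling away from it, which is the contrast the abstract draws between the earlier non-convergence result and the WoLF modification.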

Michael H. Bowling, Manuela M. Veloso

Gradient Ascent Learning | ICML 2001 | Machine Learning | Traditional Learning Algorithms | Variable Learning Rate

Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2001
Where	ICML
Authors	Michael H. Bowling, Manuela M. Veloso