Sciweavers

50 search results - page 3 / 10
» Convergence and Divergence in Standard and Averaging Reinfor...
Sort
View
SIAMCO
2000
117views more  SIAMCO 2000»
13 years 7 months ago
The O.D.E. Method for Convergence of Stochastic Approximation and Reinforcement Learning
It is shown here that stability of the stochastic approximation algorithm is implied by the asymptotic stability of the origin for an associated ODE. This in turn implies convergen...
Vivek S. Borkar, Sean P. Meyn
ICML
2010
IEEE
13 years 5 months ago
Temporal Difference Bayesian Model Averaging: A Bayesian Perspective on Adapting Lambda
Temporal difference (TD) algorithms are attractive for reinforcement learning due to their ease-of-implementation and use of "bootstrapped" return estimates to make effi...
Carlton Downey, Scott Sanner
NIPS
1994
13 years 9 months ago
Generalization in Reinforcement Learning: Safely Approximating the Value Function
To appear in: G. Tesauro, D. S. Touretzky and T. K. Leen, eds., Advances in Neural Information Processing Systems 7, MIT Press, Cambridge MA, 1995. A straightforward approach to t...
Justin A. Boyan, Andrew W. Moore
NIPS
2000
13 years 9 months ago
Programmable Reinforcement Learning Agents
We present an expressive agent design language for reinforcement learning that allows the user to constrain the policies considered by the learning process.The language includes s...
David Andre, Stuart J. Russell
ATAL
2008
Springer
13 years 9 months ago
Efficient multi-agent reinforcement learning through automated supervision
Multi-Agent Reinforcement Learning (MARL) algorithms suffer from slow convergence and even divergence, especially in large-scale systems. In this work, we develop a supervision fr...
Chongjie Zhang, Sherief Abdallah, Victor R. Lesser