Sciweavers

54 search results - page 7 / 11
» Convergence Results for Single-Step On-Policy Reinforcement-...
NECO
2010
13 years 5 months ago
Posterior Weighted Reinforcement Learning with State Uncertainty
Reinforcement learning models generally assume that a stimulus is presented that allows a learner to unambiguously identify the state of nature, and the reward received is drawn f...
Tobias Larsen, David S. Leslie, Edmund J. Collins,...
ICMLA
2010
13 years 4 months ago
Incremental Learning of Relational Action Rules
In the Relational Reinforcement learning framework, we propose an algorithm that learns an action model allowing to predict the resulting state of each action in any give...
Christophe Rodrigues, Pierre Gérard, Cé...
ICML
2001
IEEE
14 years 8 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
ATAL
2006
Springer
13 years 11 months ago
Convergence analysis for collective vocabulary development
We study how decentralized agents can develop a shared vocabulary without global coordination. Answering this question can help us understand the emergence of many communication s...
Jun Wang, Les Gasser, Jim Houk
UAI
2003
13 years 8 months ago
On the Convergence of Bound Optimization Algorithms
Many practitioners who use EM and related algorithms complain that they are sometimes slow. When does this happen, and what can be done about it? In this paper, we study the gener...
Ruslan Salakhutdinov, Sam T. Roweis, Zoubin Ghahra...