Search Sciweavers | Sciweavers

54 search results - page 7 / 11

» Convergence Results for Single-Step On-Policy Reinforcement-...

NECO
2010

Posterior Weighted Reinforcement Learning with State Uncertainty

15 years 6 months ago

Download www.maths.bris.ac.uk

Reinforcement learning models generally assume that a stimulus is presented that allows a learner to unambiguously identify the state of nature, and the reward received is drawn f...

Tobias Larsen, David S. Leslie, Edmund J. Collins,...

ICMLA
2010

Machine Learning

Incremental Learning of Relational Action Rules

15 years 4 months ago

Download www-lipn.univ-paris13.fr

In the relational reinforcement learning framework, we propose an algorithm that learns an action model able to predict the resulting state of each action in any give...

Christophe Rodrigues, Pierre Gérard, Cé...

ICML
2001
IEEE

Machine Learning

Off-Policy Temporal Difference Learning with Function Approximation

16 years 8 months ago

Download www.cs.ualberta.ca

We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...

Doina Precup, Richard S. Sutton, Sanjoy Dasgupta

ATAL
2006
Springer

Intelligent Agents

Convergence analysis for collective vocabulary development

15 years 11 months ago

Download www.isrl.illinois.edu

We study how decentralized agents can develop a shared vocabulary without global coordination. Answering this question can help us understand the emergence of many communication s...

Jun Wang, Les Gasser, Jim Houk

UAI
2003

Artificial Intelligence

On the Convergence of Bound Optimization Algorithms

15 years 9 months ago

Download cs.nyu.edu

Many practitioners who use EM and related algorithms complain that they are sometimes slow. When does this happen, and what can be done about it? In this paper, we study the gener...

Ruslan Salakhutdinov, Sam T. Roweis, Zoubin Ghahra...
