Search Sciweavers | Sciweavers

16 search results - page 3 / 4

» On Average Versus Discounted Reward Temporal-Difference Lear...

UAI
2001

129 views, Artificial Intelligence

The Optimal Reward Baseline for Gradient-Based Reinforcement Learning

15 years 8 months ago

Download cs.anu.edu.au

There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward. Their long-run convergence has been proved, even in partially ob...

Lex Weaver, Nigel Tao

JMLR
2010

119 views

A Convergent Online Single Time Scale Actor Critic Algorithm

15 years 2 months ago

Download jmlr.csail.mit.edu

Actor-Critic based approaches were among the first to address reinforcement learning in a general setting. Recently, these algorithms have gained renewed interest due to their gen...

Dotan Di Castro, Ron Meir

NN
2002
Springer

79 views, Neural Networks

Opponent interactions between serotonin and dopamine

15 years 7 months ago

Download www.cns.nyu.edu

Anatomical and pharmacological evidence suggests that the dorsal raphe serotonin system and the ventral tegmental and substantia nigra dopamine system may act as mutual opponents....

Nathaniel D. Daw, Sham Kakade, Peter Dayan

ICML
2007
IEEE

180 views, Machine Learning

Bayesian actor-critic algorithms

16 years 8 months ago

Download www.machinelearning.org

We present a new actor-critic learning model in which a Bayesian class of non-parametric critics, using Gaussian process temporal difference learning, is used. Such critics model ...

Mohammad Ghavamzadeh, Yaakov Engel

BROADNETS
2004
IEEE

154 views, Computer Networks

Efficient QoS Provisioning for Adaptive Multimedia in Mobile Communication Networks by Reinforcement Learning

15 years 11 months ago

Download www.ece.ubc.ca

The scarcity and large fluctuations of link bandwidth in wireless networks have motivated the development of adaptive multimedia services in mobile communication networks, where i...

Fei Yu, Vincent W. S. Wong, Victor C. M. Leung
