Search Sciweavers | Sciweavers

109 search results - page 11 / 22

» Policy teaching through reward function learning

NECO
2007

258 views

Reinforcement Learning Through Modulation of Spike-Timing-Dependent Synaptic Plasticity

15 years 6 months ago

Download www.coneural.org

The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as spike-timing-dependent plasticity (...

Razvan V. Florian

ATAL
2008
Springer

123 views, Intelligent Agents

Sigma point policy iteration

15 years 8 months ago

Download web.mit.edu

In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...

Michael H. Bowling, Alborz Geramifard, David Winga...

ICML
2002
IEEE

138 views, Machine Learning

Reinforcement Learning and Shaping: Encouraging Intended Behaviors

16 years 7 months ago

Download www.grappa.univ-lille3.fr

We explore dynamic shaping to integrate our prior beliefs of the final policy into a conventional reinforcement learning system. Shaping provides a positive or negative artificial...

Adam Laud, Gerald DeJong

COLT
2008
Springer

179 views, Machine Learning

Adapting to a Changing Environment: the Brownian Restless Bandits

15 years 8 months ago

Download research.microsoft.com

In the multi-armed bandit (MAB) problem there are k distributions associated with the rewards of playing each of k strategies (slot machine arms). The reward distributions are ini...

Aleksandrs Slivkins, Eli Upfal

IJCNN
2008
IEEE

113 views, Neural Networks

Uncertainty propagation for quality assurance in Reinforcement Learning

16 years 1 month ago

Download www.inb.uni-luebeck.de

In this paper we address the reliability of policies derived by Reinforcement Learning on a limited amount of observations. This can be done in a principled manner by taking in...

Daniel Schneegaß, Steffen Udluft, Thomas Mar...
