Search Sciweavers | Sciweavers

473 search results - page 62 / 95

» Optimal policy switching algorithms for reinforcement learni...

205

click to vote

JMLR
2012

200views Programming Languages» more JMLR 2012»

Contextual Bandit Learning with Predictable Rewards

13 years 8 months ago

Download www.cs.princeton.edu

Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on th...

Alekh Agarwal, Miroslav Dudík, Satyen Kale,...

claim paper

Read More »

170

click to vote

KI
2007
Springer

124views Artificial Intelligence» more KI 2007»

Making a Robot Learn to Play Soccer Using Reward and Punishment

15 years 12 months ago

Download www.ni.uos.de

In this paper, we show how reinforcement learning can be applied to real robots to achieve optimal robot behavior. As example, we enable an autonomous soccer robot to learn interce...

Heiko Müller, Martin Lauer, Roland Hafner, Sa...

claim paper

Read More »

165

click to vote

ATAL
2003
Springer

176views Intelligent Agents» more ATAL 2003»

A selection-mutation model for q-learning in multi-agent systems

15 years 11 months ago

Download www.personeel.unimaas.nl

Although well understood in the single-agent framework, the use of traditional reinforcement learning (RL) algorithms in multi-agent systems (MAS) is not always justiﬁed. The fe...

Karl Tuyls, Katja Verbeeck, Tom Lenaerts

claim paper

Read More »

151

click to vote

ICML
2009
IEEE

148views Machine Learning» more ICML 2009»

Predictive representations for policy gradient in POMDPs

16 years 6 months ago

Download damas.ift.ulaval.ca

We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...

Abdeslam Boularias, Brahim Chaib-draa

claim paper

Read More »

170

click to vote

CIS
2005
Springer

129views Applied Computing» more CIS 2005»

An RLS-Based Natural Actor-Critic Algorithm for Locomotion of a Two-Linked Robot Arm

15 years 11 months ago

Download www-clmc.usc.edu

Recently, actor-critic methods have drawn much interests in the area of reinforcement learning, and several algorithms have been studied along the line of the actor-critic strategy...

Jooyoung Park, Jongho Kim, Daesung Kang

claim paper

Read More »

« Prev « First page 62 / 95 Last » Next »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers