Search Sciweavers | Sciweavers

109 search results - page 7 / 22

» Policy teaching through reward function learning


JMLR
2012


Contextual Bandit Learning with Predictable Rewards


Download www.cs.princeton.edu

Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on th...

Alekh Agarwal, Miroslav Dudík, Satyen Kale,...



EWRL
2008


Markov Decision Processes with Arbitrary Reward Processes


Download www.cim.mcgill.ca

Abstract. We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...

Jia Yuan Yu, Shie Mannor, Nahum Shimkin



ICML
2000
IEEE


Reinforcement Learning in POMDP's via Direct Gradient Ascent


Download reference.kfupm.edu.sa

This paper discusses theoretical and experimental aspects of gradient-based approaches to the direct optimization of policy performance in controlled POMDPs. We introduce GPOMDP, a...

Jonathan Baxter, Peter L. Bartlett



EUROCAST
2007
Springer


A k-NN Based Perception Scheme for Reinforcement Learning


Download www.dia.fi.upm.es

Reinforcement Learning (RL) is a paradigm of modern Machine Learning (ML) which uses rewards and punishments to guide the learning process. One of the central ideas of RL is learning by “direct-online...

José Antonio Martin H., Javier de Lope Asia...



CCIA
2005
Springer


Direct Policy Search Reinforcement Learning for Robot Control


Download vicorob.udg.es

This paper proposes a high-level Reinforcement Learning (RL) control system for solving the action selection problem of an autonomous robot. Although the dominant approach, whe...

Andres El-Fakdi, Marc Carreras, Narcís Palo...

