In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...
S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...
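This line of work trains a log-linear (softmax) policy with a policy-gradient update driven by the assumed reward signal. Below is a minimal sketch of that style of learner; the toy environment, the feature map phi, and all constants are illustrative stand-ins, not the paper's actual setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins: states and actions are integers, an "instruction" is an
# integer tag; phi() is an illustrative feature map (not from the paper).
N_STATES, N_ACTIONS, N_FEATS = 4, 3, 24

def phi(state, instr, action):
    """One-hot feature vector over (state, action, instruction-parity)."""
    v = np.zeros(N_FEATS)
    v[(state * N_ACTIONS + action) * 2 + (instr % 2)] = 1.0
    return v

def action_probs(w, state, instr):
    """Log-linear (softmax) policy over actions."""
    scores = np.array([w @ phi(state, instr, a) for a in range(N_ACTIONS)])
    scores -= scores.max()                      # numerical stability
    p = np.exp(scores)
    return p / p.sum()

def rollout(w, instr, horizon=5):
    """Sample an episode; reward 1 if we end in state 0 (toy task)."""
    state, grad = 0, np.zeros(N_FEATS)
    for _ in range(horizon):
        p = action_probs(w, state, instr)
        a = rng.choice(N_ACTIONS, p=p)
        # REINFORCE score function: phi(s, a) - E_pi[phi(s, .)]
        grad += phi(state, instr, a) - sum(
            p[b] * phi(state, instr, b) for b in range(N_ACTIONS))
        state = (state + a) % N_STATES          # toy deterministic dynamics
    reward = 1.0 if state == 0 else 0.0
    return reward, grad

w, alpha = np.zeros(N_FEATS), 0.1
for episode in range(2000):
    r, g = rollout(w, instr=episode % 2)
    w += alpha * r * g                          # policy-gradient update
```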
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...
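In the degenerate single-state case, an MDP with fixed dynamics and adversarially changing rewards reduces to prediction with expert advice. The sketch below shows only that multiplicative-weights core; the paper's actual algorithm additionally handles multi-state dynamics (via mixing-time arguments), which this toy omits.

```python
import numpy as np

rng = np.random.default_rng(1)

# Single-state reduction: the reward vector changes every step, the learner
# competes with the best fixed action in hindsight via exponential weights.
N_ACTIONS, T, eta = 5, 1000, 0.1
weights = np.ones(N_ACTIONS)
total, best_fixed = 0.0, np.zeros(N_ACTIONS)

for t in range(T):
    probs = weights / weights.sum()
    action = rng.choice(N_ACTIONS, p=probs)
    reward_t = rng.random(N_ACTIONS)     # stand-in for adversarial rewards
    total += reward_t[action]
    best_fixed += reward_t
    # Full-information exponential-weights update.
    weights *= np.exp(eta * reward_t)

print(f"learner: {total:.1f}  best fixed action: {best_fixed.max():.1f}")
```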
We study the problem of an apprentice learning to behave in an environment with an unknown reward function by observing the behavior of an expert. We follow on the work of Abbeel ...
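Apprenticeship-learning methods in this line compare the expert's discounted feature expectations against the learner's. A self-contained sketch of that quantity, with illustrative states, trajectories, and feature map:

```python
import numpy as np

GAMMA, N_FEATS = 0.95, 6

def feature_expectations(trajectories, phi):
    """Empirical discounted feature expectations mu = E[sum_t gamma^t phi(s_t)].
    Both Abbeel & Ng's formulation and the game-theoretic one compare the
    expert's mu against the learner's."""
    mu = np.zeros(N_FEATS)
    for traj in trajectories:
        for t, state in enumerate(traj):
            mu += (GAMMA ** t) * phi(state)
    return mu / len(trajectories)

# Illustrative stand-ins: states are integers, phi is a one-hot feature map.
phi = lambda s: np.eye(N_FEATS)[s % N_FEATS]
expert_trajs = [[0, 1, 2, 2, 2], [0, 2, 2, 2, 2]]
learner_trajs = [[0, 1, 1, 3, 4]]

mu_E = feature_expectations(expert_trajs, phi)
mu_L = feature_expectations(learner_trajs, phi)
print("distance to expert:", np.linalg.norm(mu_E - mu_L))
```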
In this paper, the application of reinforcement learning to Tetris is investigated; in particular, temporal difference learning is applied to estimate the state value funct...
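A minimal sketch of the core update: TD(0) with a linear value function V(s) = w·φ(s), which is how Tetris state values are typically estimated from hand-crafted board features (column heights, holes, etc.). A toy chain environment stands in for Tetris here, so everything below is illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

# TD(0) with linear function approximation on a toy random-walk chain.
N_STATES, GAMMA, ALPHA = 7, 0.9, 0.05
phi = lambda s: np.eye(N_STATES)[s]          # tabular = one-hot features
w = np.zeros(N_STATES)

for episode in range(500):
    s = N_STATES // 2
    while 0 < s < N_STATES - 1:
        s_next = s + rng.choice([-1, 1])     # random-walk dynamics
        r = 1.0 if s_next == N_STATES - 1 else 0.0
        done = s_next in (0, N_STATES - 1)
        target = r + (0.0 if done else GAMMA * (w @ phi(s_next)))
        # TD(0) update: move V(s) toward the bootstrapped one-step target.
        w += ALPHA * (target - w @ phi(s)) * phi(s)
        s = s_next

print(np.round(w, 2))   # learned state values of the chain
```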
The precise specification of reward functions for Markov decision processes (MDPs) is often extremely difficult, motivating research into both reward elicitation and the robust so...
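As a toy illustration of why imprecise rewards complicate planning (not the paper's minimax-regret machinery): run value iteration under the pessimistic and optimistic extremes of per-state reward intervals, and treat the gap as a flag for where elicitation would help most. All quantities below are randomly generated stand-ins.

```python
import numpy as np

rng = np.random.default_rng(3)

# Rewards known only to intervals [r_lo, r_hi]; compare the extremes.
N_S, N_A, GAMMA = 5, 2, 0.9
P = rng.dirichlet(np.ones(N_S), size=(N_S, N_A))   # P[s, a] -> dist over s'
r_lo = rng.random(N_S)
r_hi = r_lo + rng.random(N_S)

def value_iteration(r, iters=200):
    V = np.zeros(N_S)
    for _ in range(iters):
        V = r + GAMMA * np.einsum('sat,t->sa', P, V).max(axis=1)
    return V

gap = value_iteration(r_hi) - value_iteration(r_lo)
print("elicitation priority (value gap per state):", np.round(gap, 2))
```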
In designing Markov Decision Processes (MDPs), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there i...
David L. Roberts, Sooraj Bhat, Kenneth St. Clair, ...
We present a method for inferring the behavior styles of character controllers from a small set of examples. We show that a rich set of behavior variations can be captured by dete...
This paper describes a general approach for the unsupervised learning of behaviors in a behavior-based robot. The key idea is to formalize a behavior produced by a Motor Map dr...
Paolo Arena, Luigi Fortuna, Mattia Frasca, Luca Pa...
Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...
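The abstract is truncated before the paper's own method, so as a generic illustration of the stated problem, here is a simplified version of the classic linear-programming IRL formulation for finite MDPs (Ng & Russell, 2000): find a bounded reward under which the demonstrated policy is weakly optimal, maximizing the total optimality margin. The usual L1 sparsity penalty is omitted for brevity, and the MDP is randomly generated.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)

N_S, N_A, GAMMA = 4, 3, 0.9
P = rng.dirichlet(np.ones(N_S), size=(N_S, N_A))   # P[s, a] -> dist over s'
expert = rng.integers(N_A, size=N_S)               # expert policy pi(s)

P_pi = P[np.arange(N_S), expert]                   # dynamics under pi
inv_term = np.linalg.inv(np.eye(N_S) - GAMMA * P_pi)

rows = []
for s in range(N_S):
    for a in range(N_A):
        if a != expert[s]:
            # Optimality of pi requires
            # (P_pi[s] - P[s, a]) (I - gamma P_pi)^-1 R >= 0 for all a.
            rows.append((P_pi[s] - P[s, a]) @ inv_term)
M = np.array(rows)

res = linprog(c=-M.sum(axis=0),                    # maximize total margin
              A_ub=-M, b_ub=np.zeros(len(M)),      # margins must be >= 0
              bounds=[(-1.0, 1.0)] * N_S)
print("recovered reward:", np.round(res.x, 2))
```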
The paper explores a very simple agent design method called Q-decomposition, wherein a complex agent is built from simpler subagents. Each subagent has its own reward function and...
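A minimal sketch of the Q-decomposition idea: each subagent maintains a Q-table for its own reward component, updated on-policy with the arbitrator's actual actions (SARSA-style), while the arbitrator acts greedily on the sum of subagent Q-values. The toy world, reward split, and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

N_S, N_A, GAMMA, ALPHA, EPS = 6, 2, 0.9, 0.1, 0.1
Q = np.zeros((2, N_S, N_A))                 # one Q-table per subagent

def subrewards(s):
    # Subagent 0 likes the left end, subagent 1 likes the right end.
    return np.array([1.0 if s == 0 else 0.0, 1.0 if s == N_S - 1 else 0.0])

def arbitrate(s):
    if rng.random() < EPS:
        return int(rng.integers(N_A))
    return int(Q[:, s].sum(axis=0).argmax())   # argmax_a sum_j Q_j(s, a)

s, a = N_S // 2, arbitrate(N_S // 2)
for step in range(20000):
    s2 = max(0, min(N_S - 1, s + (1 if a == 1 else -1)))
    a2 = arbitrate(s2)
    r = subrewards(s2)
    for j in range(2):                      # per-subagent SARSA update
        Q[j, s, a] += ALPHA * (r[j] + GAMMA * Q[j, s2, a2] - Q[j, s, a])
    s, a = s2, a2

print("arbitrator's greedy action per state:",
      [int(Q[:, s].sum(axis=0).argmax()) for s in range(N_S)])
```

The on-policy update matters in this design: if each subagent instead bootstrapped off its own locally greedy action (plain Q-learning), it would evaluate actions the arbitrator never takes, so the subagent values would not compose correctly.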