Sciweavers

ACL
2009
13 years 9 months ago
Reinforcement Learning for Mapping Instructions to Actions
In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...
S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...
NIPS
2004
14 years 25 days ago
Experts in a Markov Decision Process
We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...
Eyal Even-Dar, Sham M. Kakade, Yishay Mansour
NIPS
2007
14 years 26 days ago
A Game-Theoretic Approach to Apprenticeship Learning
We study the problem of an apprentice learning to behave in an environment with an unknown reward function by observing the behavior of an expert. We follow on the work of Abbeel ...
Umar Syed, Robert E. Schapire
ESANN
2008
14 years 27 days ago
Learning to play Tetris applying reinforcement learning methods
In this paper the application of reinforcement learning to Tetris is investigated; in particular, the idea of temporal difference learning is applied to estimate the state value funct...
Alexander Groß, Jan Friedland, Friedhelm Sch...
AAAI
2010
14 years 27 days ago
Robust Policy Computation in Reward-Uncertain MDPs Using Nondominated Policies
The precise specification of reward functions for Markov decision processes (MDPs) is often extremely difficult, motivating research into both reward elicitation and the robust so...
Kevin Regan, Craig Boutilier
AAAI
2007
14 years 1 month ago
Authorial Idioms for Target Distributions in TTD-MDPs
In designing Markov Decision Processes (MDP), one must define the world, its dynamics, a set of actions, and a reward function. MDPs are often applied in situations where there i...
David L. Roberts, Sooraj Bhat, Kenneth St. Clair, ...
SIGGRAPH
2010
ACM
14 years 3 months ago
Learning behavior styles with inverse reinforcement learning
We present a method for inferring the behavior styles of character controllers from a small set of examples. We show that a rich set of behavior variations can be captured by dete...
Seong Jae Lee, Zoran Popovic
ISCAS
2006
IEEE
14 years 5 months ago
Towards autonomous adaptive behavior in a bio-inspired CNN-controlled robot
This paper describes a general approach for the unsupervised learning of behaviors in a behavior-based robot. The key idea is to formalize a behavior produced by a Motor Map dr...
Paolo Arena, Luigi Fortuna, Mattia Frasca, Luca Pa...
PKDD
2009
Springer
14 years 6 months ago
Active Learning for Reward Estimation in Inverse Reinforcement Learning
Inverse reinforcement learning addresses the general problem of recovering a reward function from samples of a policy provided by an expert/demonstrator. In this paper, w...
Manuel Lopes, Francisco S. Melo, Luis Montesano
ICML
2003
IEEE
15 years 7 days ago
Q-Decomposition for Reinforcement Learning Agents
The paper explores a very simple agent design method called Q-decomposition, wherein a complex agent is built from simpler subagents. Each subagent has its own reward function and...
Stuart J. Russell, Andrew Zimdars