Search Sciweavers | Sciweavers

109 search results - page 8 / 22

» Policy teaching through reward function learning

AUSAI
1999
Springer

118 views, Artificial Intelligence

Q-Learning in Continuous State and Action Spaces

13 years 12 months ago

Download users.cecs.anu.edu.au

Abstract. Q-learning can be used to learn a control policy that maximises a scalar reward through interaction with the environment. Q-learning is commonly applied to problems with d...

Chris Gaskett, David Wettergreen, Alexander Zelins...

ATAL
2010
Springer

134 views, Intelligent Agents

Cultivating desired behaviour: policy teaching via environment-dynamics tweaks

13 years 8 months ago

Download eprints.ecs.soton.ac.uk

In this paper we study, for the first time explicitly, the implications of endowing an interested party (i.e. a teacher) with the ability to modify the underlying dynamics of the ...

Zinovi Rabinovich, Lachlan Dufton, Kate Larson, Ni...

ACL
2009

123 views, Computational Linguistics

Reinforcement Learning for Mapping Instructions to Actions

13 years 5 months ago

Download www.aclweb.org

In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function tha...

S. R. K. Branavan, Harr Chen, Luke S. Zettlemoyer,...

PKDD
2010
Springer

164 views, Data Mining

Efficient Planning in Large POMDPs through Policy Graph Based Factorized Approximations

13 years 5 months ago

Download users.ics.tkk.fi

Partially observable Markov decision processes (POMDPs) are widely used for planning under uncertainty. In many applications, the huge size of the POMDP state space makes straightf...

Joni Pajarinen, Jaakko Peltonen, Ari Hottinen, Mik...

FLAIRS
2004

140 views, Artificial Intelligence

State Space Reduction For Hierarchical Reinforcement Learning

13 years 9 months ago

Download ranger.uta.edu

This paper provides new techniques for abstracting the state space of a Markov Decision Process (MDP). These techniques extend one of the recent minimization models, known as ε-reduction, ...

Mehran Asadi, Manfred Huber
