Sciweavers

109 search results - page 4 / 22
» Policy teaching through reward function learning
Sort
View
IIE
2007
63views more  IIE 2007»
13 years 7 months ago
Investigation of Q-Learning in the Context of a Virtual Learning Environment
We investigate the possibility to apply a known machine learning algorithm of Q-learning in the domain of a Virtual Learning Environment (VLE). It is important in this problem doma...
Dalia Baziukaite
AAMAS
2010
Springer
13 years 7 months ago
Teaching a pet-robot to understand user feedback through interactive virtual training tasks
Abstract In this paper, we present a human-robot teaching framework that uses "virtual" games as a means for adapting a robot to its user through natural interaction in a...
Anja Austermann, Seiji Yamada
IJCAI
2007
13 years 9 months ago
Bayesian Inverse Reinforcement Learning
Inverse Reinforcement Learning (IRL) is the problem of learning the reward function underlying a Markov Decision Process given the dynamics of the system and the behaviour of an e...
Deepak Ramachandran, Eyal Amir
NECO
2010
97views more  NECO 2010»
13 years 6 months ago
Derivatives of Logarithmic Stationary Distributions for Policy Gradient Reinforcement Learning
Most conventional Policy Gradient Reinforcement Learning (PGRL) algorithms neglect (or do not explicitly make use of) a term in the average reward gradient with respect to the pol...
Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto...
ICML
2002
IEEE
14 years 8 months ago
Hierarchically Optimal Average Reward Reinforcement Learning
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
Mohammad Ghavamzadeh, Sridhar Mahadevan