Sciweavers

343 search results - page 39 / 69
» Action discovery for reinforcement learning
JMLR
2012
11 years 10 months ago
Contextual Bandit Learning with Predictable Rewards
Contextual bandit learning is a reinforcement learning problem where the learner repeatedly receives a set of features (context), takes an action and receives a reward based on th...
Alekh Agarwal, Miroslav Dudík, Satyen Kale,...
JMLR
2002
13 years 7 months ago
Lyapunov Design for Safe Reinforcement Learning
Lyapunov design methods are used widely in control engineering to design controllers that achieve qualitative objectives, such as stabilizing a system or maintaining a system'...
Theodore J. Perkins, Andrew G. Barto
ICML
2003
IEEE
14 years 8 months ago
Hierarchical Policy Gradient Algorithms
Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...
Mohammad Ghavamzadeh, Sridhar Mahadevan
CIRA
2007
IEEE
14 years 2 months ago
Reinforcement Learning with a Supervisor for a Mobile Robot in a Real-world Environment
This paper describes two experiments with supervised reinforcement learning (RL) on a real, mobile robot. Two types of experiments were performed. One tests the robot’s relia...
Karla Conn, Richard Alan Peters II
JUCS
2007
13 years 7 months ago
Focus of Attention in Reinforcement Learning
Classification-based reinforcement learning (RL) methods have recently been proposed as an alternative to the traditional value-function based methods. These methods use...
Lihong Li, Vadim Bulitko, Russell Greiner