Sciweavers

473 search results - page 34 / 95
» Optimal policy switching algorithms for reinforcement learni...
ALT
2008
Springer
14 years 4 months ago
Online Regret Bounds for Markov Decision Processes with Deterministic Transitions
Abstract. We consider an upper confidence bound algorithm for Markov decision processes (MDPs) with deterministic transitions. For this algorithm we derive upper bounds on the onl...
Ronald Ortner
ICML
2004
IEEE
14 years 8 months ago
Apprenticeship learning via inverse reinforcement learning
We consider learning in a Markov decision process where we are not explicitly given a reward function, but where instead we can observe an expert demonstrating the task that we wa...
Pieter Abbeel, Andrew Y. Ng
AAAI
2010
13 years 9 months ago
Reinforcement Learning Via Practice and Critique Advice
We consider the problem of incorporating end-user advice into reinforcement learning (RL). In our setting, the learner alternates between practicing, where learning is based on ac...
Kshitij Judah, Saikat Roy, Alan Fern, Thomas G. Di...
ICML
2003
IEEE
14 years 8 months ago
Relativized Options: Choosing the Right Transformation
Relativized options combine model minimization methods and a hierarchical reinforcement learning framework to derive compact reduced representations of a related family of tasks. ...
Balaraman Ravindran, Andrew G. Barto
AIIA
2007
Springer
14 years 1 month ago
Reinforcement Learning in Complex Environments Through Multiple Adaptive Partitions
The application of Reinforcement Learning (RL) algorithms to learn tasks for robots is often limited by the large dimension of the state space, which may make prohibitive its appli...
Andrea Bonarini, Alessandro Lazaric, Marcello Rest...