We address the problem of autonomously learning controllers for vision-capable mobile robots. We extend McCallum's (1995) Nearest-Sequence Memory algorithm to allow for genera...
Viktor Zhumatiy, Faustino J. Gomez, Marcus Hutter,...
Two notions of optimality have been explored in previous work on hierarchical reinforcement learning (HRL): hierarchical optimality, or the optimal policy in the space defined by ...
We consider the problem of incorporating end-user advice into reinforcement learning (RL). In our setting, the learner alternates between practicing, where learning is based on ac...
Kshitij Judah, Saikat Roy, Alan Fern, Thomas G. Di...
R-max is a very simple model-based reinforcement learning algorithm which can attain near-optimal average reward in polynomial time. In R-max, the agent always maintains a complet...
Improving the sample efficiency of reinforcement learning algorithms to scale up to larger and more realistic domains is a current research challenge in machine learning. Model-ba...