Temporal difference reinforcement learning algorithms are perfectly suited to autonomous agents because they learn directly from an agent’s experience based on sequential actio...
We consider the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes (MDPs) chosen randomly from a fixed but unknow...
Aaron Wilson, Alan Fern, Soumya Ray, Prasad Tadepa...
In this paper the notion of a partial-order plan is extended to task-hierarchies. We introduce the concept of a partial-order taskhierarchy that decomposes a problem using multi-ta...
Reinforcement learning models generally assume that a stimulus is presented that allows a learner to unambiguously identify the state of nature, and the reward received is drawn f...
Tobias Larsen, David S. Leslie, Edmund J. Collins,...
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...