In this paper the notion of a partial-order plan is extended to task-hierarchies. We introduce the concept of a partial-order taskhierarchy that decomposes a problem using multi-ta...
We introduce an approach to autonomously creating state space abstractions for an online reinforcement learning agent using a relational representation. Our approach uses a tree-b...
TD-FALCON is a self-organizing neural network that incorporates Temporal Difference (TD) methods for reinforcement learning. Despite the advantages of fast and stable learning, TD...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
er provides new techniques for abstracting the state space of a Markov Decision Process (MDP). These techniques extend one of the recent minimization models, known as -reduction, ...