This paper merges hierarchical reinforcement learning (HRL) with ant colony optimization (ACO) to produce an HRL ACO algorithm capable of generating solutions for large domains. Th...
Reinforcement learning (RL) algorithms attempt to assign the credit for rewards to the actions that contributed to the reward. Thus far, credit assignment has been done in one of t...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
Closed-loop control relies on sensory feedback that is usually assumed to be free. But if sensing incurs a cost, it may be cost-effective to take sequences of actions in open-loop m...
Eric A. Hansen, Andrew G. Barto, Shlomo Zilberstei...
To accelerate reinforcement learning, many types of function approximation are used to represent the state value. However, function approximation reduces the accuracy o...