Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
In this paper, we investigate the use of hierarchical reinforcement learning (HRL) to speed up the acquisition of cooperative multi-agent tasks. We introduce a hierarchical multi-a...
Rajbala Makar, Sridhar Mahadevan, Mohammad Ghavamz...
Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is ...
We present new algorithms for inverse optimal control (or inverse reinforcement learning, IRL) within the framework of linearlysolvable MDPs (LMDPs). Unlike most prior IRL algorit...
Many popular optimization algorithms, like the Levenberg-Marquardt algorithm (LMA), use heuristic-based "controllers" that modulate the behavior of the optimizer during ...