Using multilayer perceptrons (MLPs) to approximate the state-action value function in reinforcement learning (RL) algorithms could become a nightmare due to the constant possibilit...
We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transit...
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
The reward functions that drive reinforcement learning systems are generally derived directly from the descriptions of the problems that the systems are being used to solve. In so...
Lyapunov design methods are used widely in control engineering to design controllers that achieve qualitative objectives, such as stabilizing a system or maintaining a system'...