It is shown here that stability of the stochastic approximation algorithm is implied by the asymptotic stability of the origin for an associated ODE. This in turn implies convergen...
We consider the problem of energy-efficient point-to-point transmission of delay-sensitive data (e.g. multimedia data) over a fading channel. We propose a rigorous and unified fra...
In this paper we apply the recent notion of anytime universal intelligence tests to the evaluation of a popular reinforcement learning algorithm, Q-learning. We show that a general...
Dynamic Programming, Q-learning and other discrete Markov Decision Process solvers can be applied to continuous d-dimensional state-spaces by quantizing the state space into an arr...
Reinforcement learning (RL) can be impractical for many high dimensional problems because of the computational cost of doing stochastic search in large state spaces. We propose a ...