We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
We describe and analyze an algorithmic framework for online classification where each online trial consists of multiple prediction tasks that are tied together. We tackle the prob...
This paper deals with the problem of making predictions in the online mode of learning where the dependence of the outcome yt on the signal xt can change with time. The Aggregating...
In this paper, we propose a general cross-layer optimization framework in which we explicitly consider both the heterogeneous and dynamically changing characteristics of delay-sens...