This paper presents an algorithm for adapting periodic behavior to gradual shifts in task parameters. Since learning optimal control in high dimensional domains is subject to t...
Privacy is a concept that received relatively little attention during the rapid growth and spread of information technology through the 1980s and 1990s. Design to make info...
Carolyn Brodie, Clare-Marie Karat, John Karat, Jin...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...
An ever-increasing amount of information on the Web today is available only through search interfaces: the users have to type in a set of keywords in a search form in order to acc...
This paper examines, by argument, the dynamics of sequences of behavioural choices made when non-cooperative restricted-memory agents learn in partially observable stochastic gam...