We explore an application to the game of Go of a reinforcement learning approach based on a linear evaluation function and large numbers of binary features. This strategy has prov...
Recurrent neural networks are theoretically capable of learning complex temporal sequences, but training them through gradient-descent is too slow and unstable for practical use i...
The options framework provides a method for reinforcement learning agents to build new high-level skills. However, since options are usually learned in the same state space as the...
We consider the problem of energy-efficient point-to-point transmission of delay-sensitive data (e.g. multimedia data) over a fading channel. We propose a rigorous and unified fra...
We present four new reinforcement learning algorithms based on actor-critic and natural-gradient ideas, and provide their convergence proofs. Actor-critic reinforcement learning m...
Shalabh Bhatnagar, Richard S. Sutton, Mohammad Gha...