Reinforcement learning promises a generic method for adapting agents to arbitrary tasks in arbitrary stochastic environments, but applying it to new real-world problems remains di...
A simulation-based optimization framework involving simultaneous perturbation stochastic approximation (SPSA) is presented as a means for optimally specifying parameters of intern...
Restricted Boltzmann machines were developed using binary stochastic hidden units. These can be generalized by replacing each binary unit by an infinite number of copies that all ...
Appropriately designing sampling policies is highly important for obtaining better control policies in reinforcement learning. In this paper, we first show that the least-squares ...
Abstract-- We consider reinforcement learning, and in particular, the Q-learning algorithm in large state and action spaces. In order to cope with the size of the spaces, a functio...