lative Novelty to Identify Useful Temporal Abstractions in Reinforcement Learning ?Ozg?ur S?im?sek ozgur@cs.umass.edu Andrew G. Barto barto@cs.umass.edu Department of Computer Scie...
Reinforcement learning in real-world domains suffers from three curses of dimensionality: explosions in state and action spaces, and high stochasticity. We present approaches that ...
Following Tesauro’s work on TD-Gammon, we used a 4000 parameter feed-forward neural network to develop a competitive backgammon evaluation function. Play proceeds by a roll of t...
Coordinating agents in a complex environment is a hard problem, but it can become even harder when certain characteristics of the tasks, like the required number of agents, are un...
In this article, we propose a method to adapt stepsize parameters used in reinforcement learning for dynamic environments. In general reinforcement learning situations, a stepsize...