We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Production scheduling, the problem of sequentially con guring a factory to meet forecasted demands, is a critical problem throughout the manufacturing industry. The requirement of...
Jeff G. Schneider, Justin A. Boyan, Andrew W. Moor...
This paper introduces an approach to automatic basis function construction for Hierarchical Reinforcement Learning (HRL) tasks. We describe some considerations that arise when con...
My research attempts to address on-line action selection in reinforcement learning from a Bayesian perspective. The idea is to develop more effective action selection techniques b...
A wide variety of function approximation schemes have been applied to reinforcement learning. However, Bayesian filtering approaches, which have been shown efficient in other field...