We apply CMA-ES, an evolution strategy with covariance matrix adaptation, and TDL (Temporal Difference Learning) to reinforcement learning tasks. In both cases these algorithms se...
How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore, to be able to exp...
We explore combining reinforcement learning with a hand-crafted local controller in a manner suggested by the chaotic control algorithm of Vincent, Schmitt and Vincent (1994). A c...
We contribute Policy Reuse as a technique to improve a reinforcement learning agent with guidance from past learned similar policies. Our method relies on using the past policies ...
Conditional random fields (CRFs) are graphical models for modeling the probability of labels given the observations. They have traditionally been trained with using a set of obser...
Xinhua Zhang, Douglas Aberdeen, S. V. N. Vishwanat...