We present exact algorithms for identifying deterministic-actions' effects and preconditions in dynamic partially observable domains. They apply when one does not know the ac...
Motor control depends on sensory feedback in multiple modalities with different latencies. In this paper we consider within the framework of reinforcement learning how different s...
Fredrik Bissmarck, Hiroyuki Nakahara, Kenji Doya, ...
Abstract. We formulate the problem of least squares temporal difference learning (LSTD) in the framework of least squares SVM (LS-SVM). To cope with the large amount (and possible ...
As applications for artificially intelligent agents increase in complexity we can no longer rely on clever heuristics and hand-tuned behaviors to develop their programming. Even t...
Shawn Arseneau, Wei Sun, Changpeng Zhao, Jeremy R....
The choice of a good annealing schedule is necessary for good performance of simulated annealing for combinatorial optimization problems. In this paper, we pose the simulated anne...