We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
Matching word images has many applications in document recognition and retrieval systems. Dynamic Time Warping (DTW) is popularly used to estimate the similarity between word imag...
CBR is one of the techniques that can be applied to the task of approximating a function over high-dimensional, continuous spaces. In Reinforcement Learning systems a learning agen...
Our “information-oriented” society shows an increasing exigency of life-long learning. In such framework, online Learning is becoming an important tool to allow the flexibilit...
This paper introduces an approach to automatic basis function construction for Hierarchical Reinforcement Learning (HRL) tasks. We describe some considerations that arise when con...