Abstract— We describe a general method to transform a non-Markovian sequential decision problem into a supervised learning problem using a K-bestpaths algorithm. We consider an a...
This paper introduces a new approach to actionvalue function approximation by learning basis functions from a spectral decomposition of the state-action manifold. This paper exten...
We study multivariate approximation for continuous functions in the average case setting. The space of d variate continuous functions is equipped with the zero mean Gaussian measu...
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...
Poisson regression models the noisy output of a counting function as a Poisson random variable, with a log-mean parameter that is a linear function of the input vector. In this wo...