Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...
We consider reinforcement learning as solving a Markov decision process with unknown transition distribution. Based on interaction with the environment, an estimate of the transit...
POMDPs are the models of choice for reinforcement learning (RL) tasks where the environment cannot be observed directly. In many applications we need to learn the POMDP structure ...
We consider the exploration/exploitation problem in reinforcement learning (RL). The Bayesian approach to model-based RL offers an elegant solution to this problem, by considering...
— Our goal in this work is to make high level decisions for mobile robots. In particular, given a queue of prioritized object delivery tasks, we wish to find a sequence of actio...