We introduce a class of learning problems where the agent is presented with a series of tasks. Intuitively, if there is relation among those tasks, then the information gained duri...
We present an efficient "sparse sampling" technique for approximating Bayes optimal decision making in reinforcement learning, addressing the well known exploration vers...
Tao Wang, Daniel J. Lizotte, Michael H. Bowling, D...
A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...
Abstract. We consider a control problem where the decision maker interacts with a standard Markov decision process with the exception that the reward functions vary arbitrarily ove...
In timed, zero-sum games, the goal is to maximize the probability of winning, which is not necessarily the same as maximizing our expected reward. We consider cumulative intermedi...