Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world proble...
We present a novel Bayesian approach to the problem of value function estimation in continuous state spaces. We define a probabilistic generative model for the value function by i...
Abstract— This paper compares the use of temporal difference learning (TDL) versus co-evolutionary learning (CEL) for acquiring position evaluation functions for the game of Othe...
To avoid the curse of dimensionality, function approximators are used in reinforcement learning to learn value functions for individual states. In order to make better use of comp...
We introduce relational temporal difference learning as an effective approach to solving multi-agent Markov decision problems with large state spaces. Our algorithm uses temporal ...