Abstract. Q-learning can be used to learn a control policy that maximises a scalar reward through interaction with the environment. Qlearning is commonly applied to problems with d...
Chris Gaskett, David Wettergreen, Alexander Zelins...
Models of agent-environment interaction that use predictive state representations (PSRs) have mainly focused on the case of discrete observations and actions. The theory of discre...
A gradient-based method for both symmetric and asymmetric multiagent reinforcement learning is introduced in this paper. Symmetric multiagent reinforcement learning addresses the ...
CBR is one of the techniques that can be applied to the task of approximating a function over high-dimensional, continuous spaces. In Reinforcement Learning systems a learning agen...
Abstract. In order to establish autonomous behavior for technical systems, the well known trade-off between reactive control and deliberative planning has to be considered. Within ...