Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framew...
We present a connectionist architecture that can learn a model of the relations between perceptions and actions and use this model for behavior planning. State representations are...
Abstract. Learning to act in an unknown partially observable domain is a difficult variant of the reinforcement learning paradigm. Research in the area has focused on model-free m...
Imitation can be viewed as a means of enhancing learning in multiagent environments. It augments an agent’s ability to learn useful behaviors by making intelligent use of the kn...
To gain insights into the neural basis of such adaptive decision-making processes, we investigated the nature of learning process in humans playing a competitive game with binary ...