This paper studies the behavior of a genetic-algorithm-based classifier system, the Adapted Pittsburgh Classifier System (A.P.C.S), on maze-type environments con...
Autonomous agents that learn about their environment can be divided into two broad classes. One class of existing learners, reinforcement learners, typically employs weak learning ...
In this paper we study a class of resource allocation games inspired by the El Farol Bar problem. We consider a system of competitive agents that have to choose between ...
Partially observable Markov decision processes (POMDPs) are widely used for planning under uncertainty. In many applications, the huge size of the POMDP state space makes straightf...
Joni Pajarinen, Jaakko Peltonen, Ari Hottinen, Mik...
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...