Abstract—This paper considers an uplink time division multiple access (TDMA) cognitive radio network where multiple cognitive radios (secondary users) attempt to access a spect...
In this paper we present RETALIATE, an online reinforcement learning algorithm for developing winning policies in team firstperson shooter games. RETALIATE has three crucial chara...
Min-max functions are dynamic programming operators of zero-sum deterministic games with finite state and action spaces. The problem of computing the linear growth rate of the or...
In the Markov decision process (MDP) formalization of reinforcement learning, a single adaptive agent interacts with an environment defined by a probabilistic transition function....
We describe a generalized Q-learning type algorithm for reinforcement learning in competitive multi-agent games. We make the observation that in a competitive setting with adaptive...
Pieter Jan't Hoen, Sander M. Bohte, Han La Poutr&e...