We consider the problem of several users transmitting packets to a base station, and study an optimal scheduling formulation involving three communication layers, namely, the mediu...
We give a closed-form expression for the discounted weighted queue length and switching costs of a two-class single-server queueing model under a preemptive priority rule. These e...
We consider the Bellman residual minimization approach for solving discounted Markov decision problems, where we assume that a generative model of the dynamics and rewards is avai...
In this paper we compare state-of-the-art multi-agent reinforcement learning algorithms in a wide variety of games. We consider two types of algorithms: value iteration a...
H. Jaap van den Herik, Daniel Hennes, Michael Kais...
We propose a novel algorithm called GA-MDP for solving the frequency assignment problem. GA-MDP inherits the spirit of genetic algorithms with an adaptation of Markov Decision Proc...