We introduce a new algorithm based on linear programming that approximates the differential value function of an average-cost Markov decision process via a linear combination of p...
In this paper, we describe the partially observable Markov decision process (POMDP) approach to finding optimal or near-optimal control strategies for partially observable stochastic ...
Anthony R. Cassandra, Leslie Pack Kaelbling, Micha...
We consider the problem of task assignment and execution in multirobot systems, by proposing a procedure for bid estimation in auction protocols. Auctions are of interest to mu...
The use of PKI in large-scale environments suffers from some inherent problems concerning the options to adopt for the optimal cost-centered operation of the system. In this paper a Mar...
Agapios N. Platis, Costas Lambrinoudakis, Assimaki...
One of the key problems in reinforcement learning is balancing exploration and exploitation. Another is learning and acting in large or even continuous Markov decision processes (...
Lihong Li, Michael L. Littman, Christopher R. Mans...