This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. W...
A key problem in reinforcement learning is finding a good balance between the need to explore the environment and the need to gain rewards by exploiting existing knowledge. Much ...
We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalization...
Daniel S. Bernstein, Shlomo Zilberstein, Neil Imme...
We study the computational complexity of some central analysis problems for One-Counter Markov Decision Processes (OC-MDPs), a class of finitely-presented, countable-state MDPs. O...
Tomas Brazdil, Vaclav Brozek, Kousha Etessami, Ant...
We address the problem of computing an optimal value function for Markov decision processes. Since finding this function quickly and accurately requires substantial computation ef...
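For readers unfamiliar with the term, the optimal value function mentioned in the last abstract can be illustrated with textbook value iteration. The sketch below is a generic baseline, not the method proposed in that paper. The two-state MDP, the discount factor, and all names (`value_iteration`, `TOY_MDP`) are assumptions made purely for illustration.

```python
from typing import Dict, List, Tuple

# transitions[state][action] is a list of (probability, next_state, reward).
Transitions = Dict[int, Dict[int, List[Tuple[float, int, float]]]]


def value_iteration(
    transitions: Transitions,
    gamma: float = 0.9,
    tol: float = 1e-8,
    max_iters: int = 10_000,
) -> Dict[int, float]:
    """Repeatedly apply the Bellman optimality backup until values stabilize."""
    values = {state: 0.0 for state in transitions}
    for _ in range(max_iters):
        new_values = {}
        delta = 0.0
        for state, actions in transitions.items():
            # Expected return of each action, then keep the best one.
            best = max(
                sum(p * (r + gamma * values[nxt]) for p, nxt, r in outcomes)
                for outcomes in actions.values()
            )
            new_values[state] = best
            delta = max(delta, abs(best - values[state]))
        values = new_values
        if delta < tol:
            break
    return values


# Hypothetical toy MDP (illustration only): action 0 stays, action 1 switches.
# Staying in state 1 pays 2 per step; moving from state 0 to state 1 pays 1.
TOY_MDP: Transitions = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 1, 2.0)], 1: [(1.0, 0, 0.0)]},
}

optimal_values = value_iteration(TOY_MDP)
```

With a discount factor of 0.9, staying in state 1 forever is worth 2 / (1 - 0.9) = 20, and state 0 is worth 1 + 0.9 × 20 = 19, which is what the iteration converges to on this toy problem. Each sweep touches every state and action, so the cost grows with the size of the model.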