We consider an MDP setting in which the reward function is allowed to change during each time step of play (possibly in an adversarial manner), yet the dynamics remain fixed. Simi...
This paper examines the problem of finding an optimal policy for a Partially Observable Markov Decision Process (POMDP) when the model is not known or is only poorly specified. W...
Magnifying Lens Abstraction in Markov Decision Processes ∗ Pritam Roy1 David Parker2 Gethin Norman2 Luca de Alfaro1 Computer Engineering Dept, UC Santa Cruz, Santa Cruz, CA, USA ...
Pritam Roy, David Parker, Gethin Norman, Luca de A...
We consider decentralized control of Markov decision processes and give complexity bounds on the worst-case running time for algorithms that find optimal solutions. Generalization...
Daniel S. Bernstein, Shlomo Zilberstein, Neil Imme...
The majority of the work in the area of Markov decision processes has focused on expected values of rewards in the objective function and expected costs in the constraints. Althou...