We present an algorithm called Optimistic Linear Programming (OLP) for learning to optimize average reward in an irreducible but otherwise unknown Markov decision process (MDP). O...
Variants of the decentralized MDP model focus on problems exhibiting some special structure that makes them easier to solve in practice. Our work is concerned with two main issues...
Markov decision processes (MDPs) have proven to be popular models for decision-theoretic planning, but standard dynamic programming algorithms for solving MDPs rely on explicit, stat...
Many stochastic planning problems can be represented using Markov Decision Processes (MDPs). A difficulty with using these MDP representations is that the common algorithms for so...
Decentralized Markov decision processes are frequently used to model cooperative multi-agent systems. In this paper, we identify a subclass of general DEC-MDPs that features regul...