Howard's policy iteration algorithm is one of the most widely used algorithms for finding optimal policies for controlling Markov Decision Processes (MDPs). When applied to we...
Markov decision processes (MDPs) and contingency planning (CP) are two widely used approaches to planning under uncertainty. MDPs are attractive because the model is extremely gen...
Partially observable Markov decision processes (POMDPs) are widely used for planning under uncertainty. In many applications, the huge size of the POMDP state space makes straightf...
Joni Pajarinen, Jaakko Peltonen, Ari Hottinen, Mik...
Recent adaptive image interpretation systems can reach optimal performance for a given domain via machine learning, without human intervention. The policies are learned over an ex...
The problem of deriving joint policies for a group of agents that maximize some joint reward function can be modeled as a decentralized partially observable Markov decision proces...
Ranjit Nair, Milind Tambe, Makoto Yokoo, David V. ...