We consider the problem of estimating the policy gradient in Partially Observable Markov Decision Processes (POMDPs) with a special class of policies that are based on Predictive ...
Multi-agent reinforcement learning (MARL) is an emerging area of research. However, it lacks two important elements: a coherent view on MARL, and a well-defined problem objective. ...
We present a major improvement to the incremental pruning algorithm for solving partially observable Markov decision processes. Our technique targets the cross-sum step of the dyn...
Abstract— We propose a planning algorithm that allows usersupplied domain knowledge to be exploited in the synthesis of information feedback policies for systems modeled as parti...
Salvatore Candido, James C. Davidson, Seth Hutchin...