We consider the task of reinforcement learning in an environment in which rare significant events occur independently of the actions selected by the controlling agent. If these ev...
We study an approach to policy selection for large relational Markov Decision Processes (MDPs). We consider a variant of approximate policy iteration (API) that replaces the usual...
The problem of reinforcement learning in large factored Markov decision processes is explored. The Q-value of a state-action pair is approximated by the free energy of a product o...
Planning in large, partially observable domains is challenging, especially when a long-horizon lookahead is necessary to obtain a good policy. Traditional POMDP planners that plan...
Partially observable Markov decision processes (POMDPs) have been
successfully applied to various robot motion planning tasks under uncertainty.
However, most existing POMDP algo...
Haoyu Bai, David Hsu, Wee Sun Lee, and Vien A. Ngo