We describe an approach for exploiting structure in Markov Decision Processes with continuous state variables. At each step of the dynamic programming, the state space is dynamica...
Zhengzhu Feng, Richard Dearden, Nicolas Meuleau, R...
In this paper, we propose and develop a novel approach to the problem of optimally managing the tax, and more generally debt, collections processes at financial institutions. Our...
Naoki Abe, Prem Melville, Cezar Pendus, Chandan K....
Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs), typically requiring O(|S|3 ) to directly solve the Bellman system of |S...
We consider model-based reinforcement learning in finite Markov Decision Processes (MDPs), focussing on so-called optimistic strategies. Optimism is usually implemented by carryin...
In this paper, we discuss the use of Targeted Trajectory Distribution Markov Decision Processes (TTD-MDPs)—a variant of MDPs in which the goal is to realize a specified distrib...
Sooraj Bhat, David L. Roberts, Mark J. Nelson, Cha...