Abstract— In this paper, we consider a class of continuous-time, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation ...
Research in efficient methods for solving infinite-horizon MDPs has so far concentrated primarily on discounted MDPs and the more general stochastic shortest path problems (SSPs...
Andrey Kolobov, Mausam, Daniel S. Weld, Hector Gef...
Using standard nonlinear programming (NLP) theory, we establish formulas for first and second order directional derivatives for optimal value functions of parametric mathematical ...
We show that a compact feasible set of a standard semi-infinite optimization problem can be approximated arbitrarily well by a level set of a single smooth function with certain r...
Partially observable Markov decision processes (POMDPs) are an intuitive and general way to model sequential decision making problems under uncertainty. Unfortunately, even approx...
Tao Wang, Pascal Poupart, Michael H. Bowling, Dale...
Recently developed dual techniques allow us to evaluate a given sub-optimal dynamic portfolio policy by using the policy to construct an upper bound on the optimal value function....
We address the problem of computing an optimal value function for Markov decision processes. Since finding this function quickly and accurately requires substantial computation ef...
This paper introduces the even-odd POMDP, an approximation to POMDPs in which the world is assumed to be fully observable every other time step. The even-odd POMDP can be converte...
This paper examines the notion of symmetry in Markov decision processes (MDPs). We define symmetry for an MDP and show how it can be exploited for more effective learning in singl...
In this paper we consider sampling based fitted value iteration for discounted, large (possibly infinite) state space, finite action Markovian Decision Problems where only a gener...