Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...
In cellular telephone systems, an important problem is to dynamically allocate the communication resource (channels) so as to maximize service in a stochastic caller environment. Th...
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
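As a rough illustration of searching over a class of parameterized policies, the sketch below tunes a single gain `theta` by random hill climbing. Everything in it is an assumption made for illustration and does not come from the abstract: the toy 1-D dynamics `x' = x + u`, the quadratic cost, the linear policy `u = -theta * x`, and the names `rollout_cost` and `policy_search`. It shows one simple form of parametric search, not the method the paper develops.

```python
import random


def rollout_cost(theta, x0=1.0, horizon=20):
    """Total cost of the policy u = -theta * x on a hypothetical 1-D system."""
    x, cost = x0, 0.0
    for _ in range(horizon):
        u = -theta * x               # parameterized policy (assumed linear)
        cost += x * x + 0.1 * u * u  # assumed quadratic state and control cost
        x = x + u                    # assumed dynamics: x' = x + u
    return cost


def policy_search(theta0=0.0, iterations=200, step=0.1, seed=0):
    """Random hill climbing over theta: keep a perturbation only if it lowers the cost."""
    rng = random.Random(seed)
    best_theta, best_cost = theta0, rollout_cost(theta0)
    for _ in range(iterations):
        candidate = best_theta + rng.gauss(0.0, step)
        cost = rollout_cost(candidate)
        if cost < best_cost:
            best_theta, best_cost = candidate, cost
    return best_theta, best_cost


theta, cost = policy_search()
print(f"theta = {theta:.3f}, cost = {cost:.3f}")
```

The search only ever evaluates whole rollouts, so it needs no gradient of the cost with respect to `theta`. A gradient-based optimizer could replace the hill-climbing loop where such gradients are available.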
Model learning combined with dynamic programming has been shown to be effective for learning control of continuous-state dynamic systems. The simplest method assumes the learned mod...
This paper presents a general method to derive tight rates of convergence for numerical approximations in optimal control when we consider variable resolution grids. We study the ...