Sciweavers

» The O.D.E. Method for Convergence of Stochastic Approximatio...
AI
1998
Springer
13 years 8 months ago
Model-Based Average Reward Reinforcement Learning
Reinforcement Learning (RL) is the study of programs that improve their performance by receiving rewards and punishments from the environment. Most RL methods optimize the discoun...
Prasad Tadepalli, DoKyeong Ok
NIPS
1996
13 years 10 months ago
Reinforcement Learning for Dynamic Channel Allocation in Cellular Telephone Systems
In cellular telephone systems, an important problem is to dynamically allocate the communication resource channels so as to maximize service in a stochastic caller environment. Th...
Satinder P. Singh, Dimitri P. Bertsekas
JMLR
2006
13 years 8 months ago
Policy Gradient in Continuous Time
Policy search is a method for approximately solving an optimal control problem by performing a parametric optimization search in a given class of parameterized policies. In order ...
Rémi Munos
NIPS
1996
13 years 10 months ago
Exploiting Model Uncertainty Estimates for Safe Dynamic Control Learning
Model learning combined with dynamic programming has been shown to be effective for learning control of continuous state dynamic systems. The simplest method assumes the learned mod...
Jeff G. Schneider
ICML
2000
IEEE
14 years 9 months ago
Rates of Convergence for Variable Resolution Schemes in Optimal Control
This paper presents a general method to derive tight rates of convergence for numerical approximations in optimal control when we consider variable resolution grids. We study the ...
Andrew W. Moore, Rémi Munos