ICANNGA
2007
Springer

Reinforcement Learning in Fine Time Discretization

Reinforcement Learning (RL) is analyzed here as a tool for control system optimization. State and action spaces are assumed to be continuous. Time is assumed to be discrete, yet the discretization may be arbitrarily fine. It is shown that stationary policies, which most RL methods apply, are unsuitable for control applications: under fine time discretization they cannot assure bounded variance of policy gradient estimators. As a remedy, we propose the use of piecewise non-Markov policies. Policies of this type can be optimized by most RL algorithms, namely those based on the likelihood ratio.
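The variance claim in the abstract can be illustrated with a small numerical sketch. This is a toy example, not taken from the paper: the system dx/dt = a, the return R = x(1), the Gaussian exploration noise, and the function name `gradient_samples` are all assumptions made for illustration. Holding the noise fixed for `hold` seconds is only one simple stand-in for a piecewise non-Markov policy and may differ from the construction the author uses.

```python
import random
import statistics


def gradient_samples(dt, hold, episodes=2000, theta=1.0, sigma=0.5,
                     horizon=1.0, seed=0):
    """Likelihood-ratio gradient samples for a toy system (hypothetical).

    Dynamics: dx/dt = a, simulated with Euler steps of length dt.
    Return:   R = x(horizon), so the true gradient dR/dtheta is horizon.
    Policy:   a = theta + sigma * eps, with eps redrawn every `hold` seconds.
              hold == dt is a stationary policy (fresh noise at every step);
              hold > dt keeps the action fixed within each piece.
    """
    rng = random.Random(seed)
    n_steps = round(horizon / dt)
    steps_per_draw = max(1, round(hold / dt))
    samples = []
    for _ in range(episodes):
        x, score, eps = 0.0, 0.0, 0.0
        for t in range(n_steps):
            if t % steps_per_draw == 0:
                eps = rng.gauss(0.0, 1.0)
                # d/dtheta log N(a; theta, sigma^2) = eps / sigma,
                # counted once per independent noise draw.
                score += eps / sigma
            x += (theta + sigma * eps) * dt
        samples.append(x * score)  # one sample of the gradient estimator
    return samples


if __name__ == "__main__":
    for dt in (0.1, 0.01):
        stationary = statistics.variance(gradient_samples(dt, hold=dt))
        piecewise = statistics.variance(gradient_samples(dt, hold=0.5))
        print(f"dt={dt}: stationary var={stationary:.1f}, "
              f"piecewise var={piecewise:.1f}")
```

In this toy setting the number of independent noise draws per episode is horizon / hold. With hold == dt it grows as dt shrinks, and the estimator variance grows with it. With hold fixed in physical time, the number of draws, and hence the variance, does not depend on dt.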
Pawel Wawrzynski
Added 08 Jun 2010
Updated 08 Jun 2010
Type Conference
Year 2007
Where ICANNGA
Authors Pawel Wawrzynski