Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available after each interaction with the world. This paper develops an explicitly model-based approach extending the Dyna architecture to linear function approximation. Dyna-style planning proceeds by generating imaginary experience from the world model and then applying model-free reinforcement learning algorithms to the imagined state transitions. Our main result is a proof that linear Dyna-style planning converges to a unique solution independent of the generating distribution, under natural conditions. In the policy evaluation setting, we prove that the limit point is the least-squares (LSTD) solution. An implication of our results is that prioritized sweeping can be soundly extended to the linear approximation case, backing up to preceding features rather than to preceding states. We introduce two versions of prior...
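The planning loop described in the abstract can be illustrated with a minimal sketch for the policy evaluation case. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function name `linear_dyna_evaluate`, the three-state deterministic chain, the one-hot features, the step sizes `alpha` and `beta`, and the choice to sample unit basis vectors as imagined feature vectors. The sketch learns a linear model (`F` predicts the next feature vector, `b` predicts the reward) from real transitions, then applies the same TD(0) update to transitions imagined from that model.

```python
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    """Matrix-vector product for a list-of-rows matrix."""
    return [dot(row, v) for row in M]


def linear_dyna_evaluate(episodes=200, planning_steps=5,
                         alpha=0.1, beta=0.5, gamma=0.9, seed=0):
    """Sketch of Dyna-style policy evaluation with a linear model.

    Toy problem (an assumption): chain 0 -> 1 -> 2 -> terminal,
    reward 1 on the final transition, 0 elsewhere.
    """
    rng = random.Random(seed)
    n = 3  # number of features (one per non-terminal state)

    def phi(s):
        # One-hot features; the terminal state (s == n) maps to the zero vector.
        return [1.0 if i == s else 0.0 for i in range(n)]

    theta = [0.0] * n                      # value-function weights
    F = [[0.0] * n for _ in range(n)]      # model: expected next features
    b = [0.0] * n                          # model: expected reward

    for _ in range(episodes):
        s = 0
        while s < n:
            s_next = s + 1
            r = 1.0 if s_next == n else 0.0
            x, x_next = phi(s), phi(s_next)

            # Model-free TD(0) update on the real transition.
            delta = r + gamma * dot(theta, x_next) - dot(theta, x)
            theta = [t + alpha * delta * xi for t, xi in zip(theta, x)]

            # Gradient-style update of the linear model from the same transition.
            pred = matvec(F, x)
            for i in range(n):
                err = x_next[i] - pred[i]
                for j in range(n):
                    F[i][j] += beta * err * x[j]
            r_err = r - dot(b, x)
            b = [bi + beta * r_err * xi for bi, xi in zip(b, x)]

            # Planning: imagine transitions from the model and apply the
            # same TD(0) update to them.
            for _ in range(planning_steps):
                k = rng.randrange(n)
                x_im = [1.0 if i == k else 0.0 for i in range(n)]
                r_im = dot(b, x_im)
                x_im_next = matvec(F, x_im)
                delta = r_im + gamma * dot(theta, x_im_next) - dot(theta, x_im)
                theta = [t + alpha * delta * xi for t, xi in zip(theta, x_im)]

            s = s_next
    return theta


if __name__ == "__main__":
    print(linear_dyna_evaluate())
```

With one-hot features the linear model can represent this chain exactly, so the weights should approach the true discounted values 0.81, 0.9, and 1.0 for states 0, 1, and 2. With richer, overlapping features the same loop would apply unchanged, but the limit would be an approximation rather than the exact values.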

Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling

Artificial Intelligence | Dyna Architecture | Linear | Planning | UAI 2008 |

Post Info

Added	30 Oct 2010
Updated	30 Oct 2010
Type	Conference
Year	2008
Where	UAI
Authors	Richard S. Sutton, Csaba Szepesvári, Alborz Geramifard, Michael H. Bowling
