ICML
2009
IEEE

Model-free reinforcement learning as mixture learning

We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabilistic mixture model via sampling, addressing both the infinite and finite horizon cases. We describe a Stochastic Approximation EM algorithm for likelihood maximization that, in the tabular case, is equivalent to a non-bootstrapping optimistic policy iteration algorithm like Sarsa(1) that can be applied both in MDPs and POMDPs. On the theoretical side, by relating the proposed stochastic EM algorithm to the family of optimistic policy iteration algorithms, we provide new tools that permit the design and analysis of algorithms in that family. On the practical side, preliminary experiments on a POMDP problem demonstrated encouraging results.
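To make the phrase "non-bootstrapping optimistic policy iteration" concrete, here is a minimal tabular sketch. It is not the paper's Stochastic Approximation EM algorithm. It only illustrates the family the abstract refers to: Q-values are moved toward full Monte Carlo returns (no bootstrapping from other estimates), and the policy is made greedy again after every single episode (the "optimistic" part). The toy chain environment, the function names, and every hyperparameter are assumptions made for illustration.

```python
import random


def run_episode(q, n_states, epsilon, rng, max_steps=50):
    """Sample one episode on a hypothetical chain; action 0 = left, 1 = right.

    Returns a list of (state, action, reward). Reaching the last state
    ends the episode with reward 1; every other step gives reward 0.
    """
    state, trajectory = 0, []
    for _ in range(max_steps):
        if rng.random() < epsilon:
            action = rng.randrange(2)  # explore uniformly
        else:
            best = max(q[state])  # greedy, with random tie-breaking
            action = rng.choice([a for a in (0, 1) if q[state][a] == best])
        next_state = max(state - 1, 0) if action == 0 else state + 1
        done = next_state == n_states - 1
        trajectory.append((state, action, 1.0 if done else 0.0))
        if done:
            break
        state = next_state
    return trajectory


def monte_carlo_update(q, trajectory, gamma, step_size):
    """Move Q toward the sampled discounted return of each visited pair.

    The target is the full return of the episode, so no bootstrapping
    from other Q estimates takes place (every-visit Monte Carlo).
    """
    ret = 0.0
    for state, action, reward in reversed(trajectory):
        ret = reward + gamma * ret
        q[state][action] += step_size * (ret - q[state][action])


def train(n_states=5, episodes=500, gamma=0.95, epsilon=0.2,
          step_size=0.1, seed=0):
    """Optimistic policy iteration: the policy changes after every episode."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        trajectory = run_episode(q, n_states, epsilon, rng)
        monte_carlo_update(q, trajectory, gamma, step_size)
    return q


if __name__ == "__main__":
    for state, (left, right) in enumerate(train()):
        print(f"state {state}: Q(left)={left:.3f}  Q(right)={right:.3f}")
```

Because each update uses only the sampled return and never another state's value estimate, the same loop could in principle be run on observations instead of states, which is one informal way to read the abstract's remark that such algorithms apply to both MDPs and POMDPs.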
Nikos Vlassis, Marc Toussaint
Added 17 Nov 2009
Updated 17 Nov 2009
Type Conference
Year 2009
Where ICML
Authors Nikos Vlassis, Marc Toussaint