On-Line Search for Solving Markov Decision Processes via Heuristic Sampling



Markov Decision Processes (MDPs) have become a standard framework for solving sequential decision problems under uncertainty. The usual goal in this framework is to compute an optimal policy, which defines the optimal action for every state of the system. For complex MDPs, exact computation of optimal policies is often intractable. Several approaches have been developed to compute near-optimal policies for complex MDPs by means of function approximation and simulation. In this paper, we investigate the problem of refining near-optimal policies via on-line search techniques, tackling the local problem of finding an optimal action for a single current state of the system. More precisely, we consider an on-line approach based on sampling: at each step, a randomly sampled look-ahead tree is developed to compute the optimal action for the current state. In this work, we propose a search strategy for constructing such trees. Its purpose is to provide good "anytime" profi...
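
To make the idea of a randomly sampled look-ahead tree concrete, here is a minimal Python sketch. It is not the search strategy proposed in the paper: it expands the tree uniformly, with a fixed number of sampled successors per action and a fixed depth, and it assumes a value of zero at the leaves. The toy chain MDP in `simulate`, the action names, and all parameter values (`depth`, `width`, `gamma`, `seed`) are hypothetical and chosen only for illustration.

```python
import random

ACTIONS = ("left", "right")  # hypothetical action set for the toy problem


def simulate(state, action, rng):
    """Hypothetical generative model of a toy chain MDP with states 0..4.

    'right' moves one step toward state 4 with probability 0.8 and otherwise
    stays put; 'left' always moves one step toward state 0. The reward is 1.0
    whenever the next state is 4, and 0.0 otherwise.
    """
    if action == "right":
        nxt = min(state + 1, 4) if rng.random() < 0.8 else state
    else:
        nxt = max(state - 1, 0)
    reward = 1.0 if nxt == 4 else 0.0
    return nxt, reward


def q_estimate(state, action, depth, width, gamma, rng):
    """Estimate Q(state, action) by averaging over `width` sampled successors."""
    total = 0.0
    for _ in range(width):
        nxt, reward = simulate(state, action, rng)
        total += reward + gamma * value_estimate(nxt, depth - 1, width, gamma, rng)
    return total / width


def value_estimate(state, depth, width, gamma, rng):
    """Estimate V(state) as the best sampled Q-value; leaves are valued at 0."""
    if depth == 0:
        return 0.0
    return max(q_estimate(state, a, depth, width, gamma, rng) for a in ACTIONS)


def choose_action(state, depth=3, width=5, gamma=0.95, seed=0):
    """Build a sampled look-ahead tree rooted at `state` and pick the best action."""
    rng = random.Random(seed)
    q = {a: q_estimate(state, a, depth, width, gamma, rng) for a in ACTIONS}
    return max(q, key=q.get), q


if __name__ == "__main__":
    action, q_values = choose_action(4)
    print(action, q_values)
```

In this sketch the tree is rebuilt from scratch for each current state, and its cost grows as (number of actions × `width`) raised to the power `depth`. That exponential cost is one reason a uniform expansion is a poor fit for anytime use, and why one would want a strategy that decides where to spend samples.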

Laurent Péret, Frédérick Garc
