
CORR
2010
Springer


Mean field for Markov Decision Processes: from Discrete to Continuous Optimization


Download infoscience.epfl.ch

We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODE). We show that the optimal reward of such a Markov Decision Process, satisfying a Bellman equation, converges to the solution of a continuous Hamilton-Jacobi-Bellman (HJB) equation based on the mean field approximation of the Markov Decision Process. We give bounds on the difference of the rewards, and a constructive algorithm for deriving an approximating solution to the Markov Decision Process from a solution of the HJB equation. Three examples are developed, pertaining respectively to investment strategies, population dynamics control, and scheduling in queues. They are used to illustrate and justify the construction of the controlled ODE and to show the gain obtained by solving a continuous HJB equation rather than a large discrete Bellman equation.
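The mean field idea behind the abstract can be illustrated with a small sketch. It is not taken from the paper: the two-state model, the drift function, and the names `drift`, `simulate`, and `ode_solution` are all assumptions made for illustration. The sketch holds the action constant, so it shows only how a system of N objects approaches its ODE limit as N grows, not the Bellman or HJB optimization itself.

```python
import random


def drift(x, a):
    """Hypothetical mean field drift (an assumption, not the paper's model).

    x is the fraction of active objects, a in [0, 1] is a constant action.
    Inactive objects switch on at rate a, active objects switch off at rate 1.
    """
    return a * (1.0 - x) - x


def simulate(n, a, horizon, seed=0):
    """Simulate n objects; each step lasts 1/n time units.

    At most one object changes state per step, with probabilities chosen so
    that the expected change of x per unit of time equals drift(x, a).
    Returns the fraction of active objects at time `horizon`.
    """
    rng = random.Random(seed)
    active = 0
    for _ in range(int(horizon * n)):
        x = active / n
        p_up = a * (1.0 - x)   # one inactive object switches on
        p_down = x             # one active object switches off
        u = rng.random()
        if u < p_up:
            active += 1
        elif u < p_up + p_down:
            active -= 1
    return active / n


def ode_solution(a, horizon, dt=1e-3):
    """Integrate dx/dt = drift(x, a) from x(0) = 0 with explicit Euler steps."""
    x = 0.0
    for _ in range(int(horizon / dt)):
        x += dt * drift(x, a)
    return x


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        gap = abs(simulate(n, 0.5, 5.0) - ode_solution(0.5, 5.0))
        print(n, gap)
```

In this toy model the gap between the simulated fraction and the ODE value tends to shrink as the number of objects grows, which is the kind of behavior the paper's bounds make precise for the optimal reward.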

Nicolas Gast, Bruno Gaujal, Jean-Yves Le Boudec


Bellman Equation | CORR 2010 | Education | Markov Decision | Markov Decision Process


Post Info

Added	09 Dec 2010
Updated	09 Dec 2010
Type	Journal
Year	2010
Where	CORR
Authors	Nicolas Gast, Bruno Gaujal, Jean-Yves Le Boudec
