Sciweavers

AIPS
2011
12 years 11 months ago
Heuristic Search for Generalized Stochastic Shortest Path MDPs
Research in efficient methods for solving infinite-horizon MDPs has so far concentrated primarily on discounted MDPs and the more general stochastic shortest path problems (SSPs...
Andrey Kolobov, Mausam, Daniel S. Weld, Hector Gef...
CORR
2010
Springer
13 years 7 months ago
Mean field for Markov Decision Processes: from Discrete to Continuous Optimization
We study the convergence of Markov Decision Processes made of a large number of objects to optimization problems on ordinary differential equations (ODE). We show that the optimal...
Nicolas Gast, Bruno Gaujal, Jean-Yves Le Boudec
ICML
2010
IEEE
13 years 8 months ago
Nonparametric Return Distribution Approximation for Reinforcement Learning
Standard Reinforcement Learning (RL) aims to optimize decision-making rules in terms of the expected return. However, especially for risk-management purposes, other criteria such ...
Tetsuro Morimura, Masashi Sugiyama, Hisashi Kashim...