Sciweavers


ICML
2005
IEEE


Finite time bounds for sampling based fitted value iteration


In this paper we consider sampling-based fitted value iteration for discounted, large (possibly infinite) state space, finite-action Markovian Decision Problems where only a generative model of the transition probabilities and rewards is available. At each step, the image of the current estimate of the optimal value function under a Monte-Carlo approximation to the Bellman operator is projected onto some function space. PAC-style bounds on the weighted Lp-norm approximation error are obtained as a function of the covering number and the approximation power of the function space, the iteration number, and the sample size.
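The iteration described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy one-dimensional problem, the generative model `sample_next`, the quadratic basis in `features`, the discount factor, the sample sizes, and the choice of an ordinary least-squares fit (the p = 2 case) with fresh samples at every iteration are all assumptions made here for illustration.

```python
import numpy as np

GAMMA = 0.9            # discount factor (assumed value)
ACTIONS = (-1.0, 1.0)  # finite action set of the hypothetical toy problem


def sample_next(rng, states, action):
    """Hypothetical generative model: sampled next states and rewards."""
    noise = rng.normal(0.0, 0.05, size=states.shape)
    next_states = np.clip(states + 0.1 * action + noise, 0.0, 1.0)
    rewards = -(states - 0.5) ** 2  # reward is highest in the middle of [0, 1]
    return next_states, rewards


def features(states):
    """Quadratic polynomial basis spanning the assumed function space."""
    return np.stack([np.ones_like(states), states, states ** 2], axis=1)


def fitted_value_iteration(n_states=200, n_next=20, n_iters=30, seed=0):
    rng = np.random.default_rng(seed)
    weights = np.zeros(3)  # start from the value estimate V_0 = 0
    for _ in range(n_iters):
        # Draw base states, then n_next transitions per state and action.
        states = rng.uniform(0.0, 1.0, size=n_states)
        repeated = np.repeat(states, n_next)
        backups = np.full(n_states, -np.inf)
        for action in ACTIONS:
            next_states, rewards = sample_next(rng, repeated, action)
            targets = rewards + GAMMA * features(next_states) @ weights
            # Monte-Carlo estimate of the expected backup for this action.
            q_estimate = targets.reshape(n_states, n_next).mean(axis=1)
            backups = np.maximum(backups, q_estimate)
        # Project the sampled Bellman backup onto the function space.
        weights, *_ = np.linalg.lstsq(features(states), backups, rcond=None)
    return weights


weights = fitted_value_iteration()
# Fitted value estimates at the left edge, the middle, and the right edge.
values = features(np.array([0.0, 0.5, 1.0])) @ weights
```

In this sketch, `n_states` and `n_next` play the role of the sample size, `n_iters` the iteration number, and the span of `features` the function space; these are the quantities the bounds in the abstract are stated in terms of.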

Csaba Szepesvári, Rémi Munos


Fitted Value Iteration | Function Space | ICML 2005 | Machine Learning | Optimal Value Function


Post Info

Added	17 Nov 2009
Updated	17 Nov 2009
Type	Conference
Year	2005
Where	ICML
Authors	Csaba Szepesvári, Rémi Munos
