Sciweavers

166

ICML
2010
IEEE

219views Machine Learning» more ICML 2010»

Convergence of Least Squares Temporal Difference Methods Under General Conditions

15 years 7 months ago

We consider approximate policy evaluation for finite state and action Markov decision processes (MDP) in the off-policy learning context and with the simulation-based least square...

Huizhen Yu

claim paper

Read More »

Sciweavers

Explore & Download

Productivity Tools

Sciweavers