target policy | Sciweavers

ICML
2010
IEEE

231 views, Machine Learning

Toward Off-Policy Learning Control with Function Approximation

15 years 7 months ago

We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...

Hamid Reza Maei, Csaba Szepesvári, Shalabh ...

CASSIS
2005
Springer

142 views, Human Computer Interaction

Mobile Resource Guarantees and Policies

16 years 3 days ago

Download homepages.inf.ed.ac.uk

This paper introduces notions of resource policy for mobile code to be run on smart devices, to integrate with the proof-carrying code architecture of the Mobile Resource Guarantee...

David Aspinall, Kenneth MacKenzie
