Sciweavers

ICML
2008
IEEE
15 years 1 months ago
Sample-based learning and search with permanent and transient memories
We present a reinforcement learning architecture, Dyna-2, that encompasses both samplebased learning and sample-based search, and that generalises across states during both learni...
David Silver, Martin Müller 0003, Richard S. ...