Sciweavers

7 search results - page 2 / 2
ICML
2001
IEEE
14 years 8 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
IJRR
2011
13 years 2 months ago
Motion planning under uncertainty for robotic tasks with long time horizons
Partially observable Markov decision processes (POMDPs) are a principled mathematical framework for planning under uncertainty, a crucial capability for reliable operation...
Hanna Kurniawati, Yanzhu Du, David Hsu, Wee Sun Le...