We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
We deploy a novel Reinforcement Learning optimization technique based on afterstates learning to determine the gain that can be achieved by incorporating movement prediction inform...
This paper describes Fido, a predictive cache [Palmer 1990] that prefetches by employing an associative memory to recognize access patterns within a context over time. Repeated training a...
Most work on Predictive Representations of State (PSRs) has focused on learning and planning in unstructured domains (for example, those represented by flat POMDPs). This paper e...
David Wingate, Vishal Soni, Britton Wolfe, Satinde...
Abstract: A prominent emerging theory of sensorimotor development in biological systems proposes that control knowledge is encoded in the dynamics of physical interaction with the ...