In sequential decision-making problems formulated as Markov decision processes, state-value function approximation using domain features is a critical technique for scaling up the...
In reinforcement learning, least-squares temporal difference methods (e.g., LSTD and LSPI) are effective, data-efficient techniques for policy evaluation and control with linear v...
Michael H. Bowling, Alborz Geramifard, David Winga...
Learning, planning, and representing knowledge in large state t multiple levels of temporal abstraction are key, long-standing challenges for building flexible autonomous agents. ...
We present the first temporal-difference learning algorithm for off-policy control with unrestricted linear function approximation whose per-time-step complexity is linear in the ...
The detection of faces in images is fundamentally a rare event detection problem. Cascade classifiers provide an efficient computational solution, by leveraging the asymmetry in t...