Sciweavers

152 search results - page 31 / 31
ICML
2001
IEEE
14 years 8 months ago
Off-Policy Temporal Difference Learning with Function Approximation
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Doina Precup, Richard S. Sutton, Sanjoy Dasgupta
INFOCOM
2009
IEEE
14 years 2 months ago
Delay-Optimal Opportunistic Scheduling and Approximations: The Log Rule
This paper considers the design of opportunistic packet schedulers for users sharing a time-varying wireless channel from the performance and the robustness points of view. Firs...
Bilal Sadiq, Seung Jun Baek, Gustavo de Veciana