Sciweavers

223 search results for "Least-Squares Temporal Difference Learning" (page 32 of 45)
JMLR
2002
13 years 7 months ago
On the Convergence of Optimistic Policy Iteration
We consider a finite-state Markov decision problem and establish the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values,...
John N. Tsitsiklis
FLAIRS
2004
13 years 9 months ago
On the Pedagogically Guided Paper Recommendation for an Evolving Web-Based Learning System
In this paper we discuss the mechanism of a recommender system recommending papers for an evolving web-based learning system. Our system is unique in three aspects. The first is t...
Tiffany Ya Tang, Gordon I. McCalla
ICPR
2006
IEEE
14 years 8 months ago
Robust Recursive Learning for Foreground Region Detection in Videos with Quasi-Stationary Backgrounds
Detecting regions of interest in video sequences is the most important task in many high level video processing applications. In this paper a robust technique based on recursive l...
Alireza Tavakkoli, George Bebis, Mircea Nicolescu
AAAI
2006
13 years 9 months ago
Sample-Efficient Evolutionary Function Approximation for Reinforcement Learning
Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world proble...
Shimon Whiteson, Peter Stone
CVPR
2009
IEEE
15 years 2 months ago
Learning sign language by watching TV (using weakly aligned subtitles)
The goal of this work is to automatically learn a large number of British Sign Language (BSL) signs from TV broadcasts. We achieve this by using the supervisory information avai...
Patrick Buehler (University of Oxford), Mark Everi...