We consider a finite-state Markov decision problem and establish the convergence of a special case of optimistic policy iteration that involves Monte Carlo estimation of Q-values,...
In this paper, we discuss the mechanism of a recommender system that recommends papers for an evolving web-based learning system. Our system is unique in three aspects. The first is t...
Detecting regions of interest in video sequences is the most important task in many high-level video processing applications. In this paper, a robust technique based on recursive l...
Reinforcement learning problems are commonly tackled with temporal difference methods, which attempt to estimate the agent's optimal value function. In most real-world proble...
The goal of this work is to automatically learn a large number of British Sign Language (BSL) signs from TV broadcasts. We achieve this by using the supervisory information avai...
Patrick Buehler (University of Oxford), Mark Everi...