This paper describes an unsupervised learning technique for modeling human locomotion styles, such as distinct related activities (e.g. running and striding) or variations of the ...
In this paper, we survey the current state-ofart models for structured learning problems, including Hidden Markov Model (HMM), Conditional Random Fields (CRF), Averaged Perceptron...
We consider the problem of choosing, sequentially, a map which assigns elements of a set A to a few elements of a set B. On each round, the algorithm suffers some cost associated ...
Alexander Rakhlin, Jacob Abernethy, Peter L. Bartl...
We consider the problem of multi-task reinforcement learning, where the agent needs to solve a sequence of Markov Decision Processes (MDPs) chosen randomly from a fixed but unknow...
Aaron Wilson, Alan Fern, Soumya Ray, Prasad Tadepa...
The UCT algorithm learns a value function online using sample-based search. The TD() algorithm can learn a value function offline for the on-policy distribution. We consider three...