Because the high dimensionality of motion capture data complicates motion analysis, a method of motion data processing based on manifold learning was propos...
We introduce the first algorithm for off-policy temporal-difference learning that is stable with linear function approximation. Off-policy learning is of interest because it forms...
Nonlinear dimensionality reduction methods are often used to visualize high-dimensional data, although the existing methods have been designed for other related tasks such as mani...
Jarkko Venna, Jaakko Peltonen, Kristian Nybo, Hele...
Most learning algorithms for undirected graphical models require complete inference over at least one instance before parameter updates can be made. SampleRank is a rank-based lear...
Sameer Singh, Limin Yao, Sebastian Riedel, Andrew ...
In this paper, we investigate the use of parallelization in reinforcement learning (RL), with the goal of learning optimal policies for single-agent RL problems more quickly by us...