We present metric-E3, a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...
We present a new method for transductive learning, which can be seen as a transductive version of the k-nearest-neighbor classifier. Unlike for many other transductive learning me...
We investigate algebraic, logical, and geometric properties of concepts recognized by various classes of probabilistic classifiers. For this we introduce a natural hierarchy of pr...
Learning to fly an aircraft is a complex task that requires the development of control skills and goal achievement strategies. This paper presents a behavioural cloning system tha...
In this paper we present a simple-to-implement, truly online, large-margin version of the Perceptron ranking (PRank) algorithm, called the OAP-BPM (Online Aggregate Prank-Bayes Poin...
Hierarchical reinforcement learning is a general framework which attempts to accelerate policy learning in large domains. On the other hand, policy gradient reinforcement learning...