We present metric?? , a provably near-optimal algorithm for reinforcement learning in Markov decision processes in which there is a natural metric on the state space that allows t...
We present a general Bayesian framework for hyperparameter tuning in L2-regularized supervised learning models. Paradoxically, our algorithm works by first analytically integratin...
Multiarmed bandit problem is a typical example of a dilemma between exploration and exploitation in reinforcement learning. This problem is expressed as a model of a gambler playi...
The goal of transfer learning is to use the knowledge acquired in a set of source tasks to improve performance in a related but previously unseen target task. In this paper, we pr...
Manu Sharma, Michael P. Holmes, Juan Carlos Santam...
Most of supervised learning algorithms assume the stability of the target concept over time. Nevertheless in many real-user modeling systems, where the data is collected over an ex...