We present a general approach to model selection and regularization that exploits unlabeled data to adaptively control hypothesis complexity in supervised learning tasks. The idea ...
T ORDER REGRESSION (EXTENDED ABSTRACT) Kurt Driessensa Saso Dzeroskib a Department of Computer Science, University of Waikato, Hamilton, New Zealand (kurtd@waikato.ac.nz) b Departm...
We present Policy Gradient Actor-Critic (PGAC), a new model-free Reinforcement Learning (RL) method for creating limited-memory stochastic policies for Partially Observable Markov ...
We study the problem of domain transfer for a supervised classification task in mRNA splicing. We consider a number of recent domain transfer methods from machine learning, includ...
Gabriele Schweikert, Christian Widmer, Bernhard Sc...
— The least squares approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular an...
Masashi Sugiyama, Hirotaka Hachiya, Christopher To...