This paper formalizes Feature Selection as a Reinforcement Learning problem, leading to a provably optimal though intractable selection policy. As a second contribution, this pape...
We develop an online algorithm called Component Hedge for learning structured concept classes when the loss of a structured concept sums over its components. Example classes inclu...
Wouter M. Koolen, Manfred K. Warmuth, Jyrki Kivine...
Policy gradient approaches are a powerful instrument for learning how to interact with the environment. Existing approaches have focused on propositional and continuous domains on...
We consider the problem of clustering in its most basic form where only a local metric on the data space is given. No parametric statistical model is assumed, and the number of cl...
This paper deals with the problem of making predictions in the online mode of learning where the dependence of the outcome yt on the signal xt can change with time. The Aggregating...