Abstract. While direct, model-free reinforcement learning often performs better than model-based approaches in practice, only the latter have yet supported theoretical guarantees f...
In this paper we consider the problem of policy evaluation in reinforcement learning, i.e., learning the value function of a fixed policy, using the least-squares temporal-differe...
Alessandro Lazaric, Mohammad Ghavamzadeh, Ré...
We show that random DNF formulas, random log-depth decision trees and random deterministic finite acceptors cannot be weakly learned with a polynomial number of statistical queries...
Dana Angluin, David Eisenstat, Leonid Kontorovich,...
In this paper, we propose a Multi-View Expectation and Maximization algorithm for finite mixture models to handle real-world learning problems that have natural feature split...
The multi-period newsvendor problem describes the dilemma of a newspaper salesman: how many papers should he purchase each day to resell when he doesn't know the demand? We d...