We study on-line decision problems where the set of actions that are available to the decision algorithm vary over time. With a few notable exceptions, such problems remained larg...
Robert D. Kleinberg, Alexandru Niculescu-Mizil, Yo...
External regret compares the performance of an online algorithm, selecting among N actions, to the performance of the best of those actions in hindsight. Internal regret compares ...
The analysis of online least squares estimation is at the heart of many stochastic sequential decision-making problems. We employ tools from the self-normalized processes to provi...
This work deals with four classical prediction settings, namely full information, bandit, label efficient and bandit label efficient as well as four different notions of regret: p...
Abstract. Algorithm selection is typically based on models of algorithm performance learned during a separate offline training sequence, which can be prohibitively expensive. In r...