Abstract. In the multiarmed bandit problem, a gambler must decide which arm of K nonidentical slot machines to play in a sequence of trials so as to maximize his reward. This class...
Abstract. We investigate the generalization behavior of sequential prediction (online) algorithms, when data are generated from a probability distribution. Using some newly develop...
In this article, I will consider Markov Decision Processes with two criteria, each defined as the expected value of an infinite horizon cumulative return. The second criterion is e...
The scores returned by support vector machines are often used as confidence measures in the classification of new examples. However, there is no theoretical argument sustaining ...
In this work we consider the problem of universal prediction of individual sequences where the universal predictor is a deterministic finite state machine, with a fixed, relativel...