Abstract. We investigate the generalization behavior of sequential prediction (online) algorithms when the data are generated from a probability distribution. Using newly developed probability inequalities, we bound the total generalization performance of a learning algorithm in terms of its observed total loss. Consequences of this analysis are illustrated with examples.