Geoffrey J. Gordon

We present a unified framework for reasoning about worst-case regret bounds for learning algorithms. This framework is based on the theory of duality of convex functions. It brings together results from computational learning theory and Bayesian statistics, allowing us to derive new proofs of known theorems, new theorems about known algorithms, and new algorithms.

1 The inference problem

We are interested in the following kind of inference problem: on each time step $t = 1, \ldots, T$ we must choose a prediction vector $w_t$ from a set of allowable predictions $W$. The interpretation of $w_t$ depends on the details of the problem, but for example $w_t$ might be our guess at the mean of a sequence of numbers or the coefficients of a linear regression. Then the loss function $l_t(w)$ is revealed to us, and we are penalized $l_t(w_t)$. These penalties are additive, so our overall goal is to minimize $\sum_{t=1}^T l_t(w_t)$. Our choice of $w_t$ may depend on $l_1, \ldots, l_{t-1}$, and possibly on some additional prior information, but it ...
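To make the protocol concrete, here is a minimal sketch in Python. It assumes squared loss $l_t(w) = (w - x_t)^2$ over a stream of observations $x_t$ and a learner that predicts the running mean of the past observations; the loss, the learner, and the comparator (the best fixed prediction in hindsight, the quantity regret is measured against) are illustrative choices, not prescribed by the text.

```python
# Illustrative sketch of the online prediction protocol: on each step t,
# choose w_t using only l_1..l_{t-1}, then observe l_t and pay l_t(w_t).
# Squared loss and a running-mean learner are assumptions for this example.
import random

def squared_loss(w, x):
    """Loss l_t(w) = (w - x_t)^2 for observation x_t."""
    return (w - x) ** 2

def play(observations):
    """Run the protocol and return the cumulative penalty sum_t l_t(w_t)."""
    total_loss = 0.0
    past = []  # information available before choosing w_t: x_1..x_{t-1}
    for x_t in observations:
        # w_t may depend only on previously revealed losses (here, past x's).
        w_t = sum(past) / len(past) if past else 0.0
        total_loss += squared_loss(w_t, x_t)  # penalty l_t(w_t)
        past.append(x_t)  # l_t is revealed only after w_t is chosen
    return total_loss

if __name__ == "__main__":
    random.seed(0)
    xs = [random.gauss(1.0, 1.0) for _ in range(100)]
    learner_loss = play(xs)
    # Under squared loss, the best fixed prediction in hindsight is the mean.
    best_w = sum(xs) / len(xs)
    best_loss = sum(squared_loss(best_w, x) for x in xs)
    print(f"cumulative loss:   {learner_loss:.2f}")
    print(f"best fixed w loss: {best_loss:.2f}")
    print(f"regret:            {learner_loss - best_loss:.2f}")
```

The printed regret, $\sum_{t=1}^T l_t(w_t) - \min_{w \in W} \sum_{t=1}^T l_t(w)$, is the quantity whose worst-case bounds the framework above is designed to analyze.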