We describe online algorithms for learning a rotation from pairs of unit vectors in $\mathbb{R}^n$. We show that the expected regret of our online algorithm compared to the best fixed rotation chosen offline is $O(\sqrt{nL})$, where $L$ is the loss of the best rotation. We also give a lower bound that proves that this expected regret bound is optimal within a constant factor. This resolves an open problem posed in COLT 2008. Our online algorithm for choosing a rotation matrix in each trial is based on the Follow-The-Perturbed-Leader paradigm. It adds a random spectral perturbation to the matrix characterizing the loss incurred so far and then chooses the best rotation matrix for that loss. We also show that any deterministic algorithm for learning rotations has $\Omega(T)$ regret in the worst case.
Elad Hazan, Satyen Kale, Manfred K. Warmuth
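
The Follow-The-Perturbed-Leader step described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: the per-trial loss on a pair of unit vectors (x, y) is taken to be 1 - y^T R x, the perturbation is built from Haar-random singular vectors with exponentially distributed singular values, and the names `best_rotation`, `spectral_perturbation`, `ftpl_rotations`, and the `scale` parameter are hypothetical. The actual perturbation distribution and its tuning may differ.

```python
import numpy as np

def best_rotation(M):
    """Rotation R (orthogonal, det +1) maximizing trace(R.T @ M), via SVD."""
    U, _, Vt = np.linalg.svd(M)
    d = np.ones(M.shape[0])
    # Flip the direction of the smallest singular value if needed so det(R) = +1.
    d[-1] = np.sign(np.linalg.det(U @ Vt))
    return U @ np.diag(d) @ Vt

def random_orthogonal(n, rng):
    """Haar-distributed orthogonal matrix from the QR factorization of a Gaussian."""
    Q, R = np.linalg.qr(rng.standard_normal((n, n)))
    return Q * np.sign(np.diag(R))  # column sign fix makes the distribution uniform

def spectral_perturbation(n, scale, rng):
    """ASSUMPTION: random singular vectors, i.i.d. exponential singular values."""
    sigma = rng.exponential(scale, size=n)
    return random_orthogonal(n, rng) @ np.diag(sigma) @ random_orthogonal(n, rng).T

def ftpl_rotations(pairs, scale=1.0, seed=0):
    """Run the sketch on a sequence of (x, y) unit-vector pairs.

    Returns the cumulative loss of the online predictions and the best
    fixed rotation in hindsight.
    """
    rng = np.random.default_rng(seed)
    n = pairs[0][0].shape[0]
    M = np.zeros((n, n))  # running sum of y x^T, which characterizes the loss so far
    total_loss = 0.0
    for x, y in pairs:
        # Perturb the accumulated matrix, then play the best rotation for it.
        R = best_rotation(M + spectral_perturbation(n, scale, rng))
        total_loss += 1.0 - y @ R @ x  # ASSUMED loss: 1 - y^T R x
        M += np.outer(y, x)
    return total_loss, best_rotation(M)
```

With the assumed loss, the cumulative loss of a fixed rotation R over T trials equals T - trace(R^T M) for M the sum of the outer products y x^T, so choosing the best rotation reduces to a single SVD. The regret of the sketch is `total_loss` minus the loss of the returned hindsight rotation.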