Re-Pair is a dictionary-based compression method invented in 1999 by Larsson and Moffat. Although its practical performance has been established through experiments, the method has resisted all attempts at formal analysis. In this paper we show that Re-Pair compresses a sequence T[1, n] over an alphabet of size σ and with k-th order entropy H_k to at most 2nH_k + o(n log σ) bits, for any k = o(log_σ n).
Gonzalo Navarro, Luís M. S. Russo
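
To make the two quantities in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual description of Re-Pair (repeatedly replace the most frequent pair of adjacent symbols with a fresh symbol) and the usual definition of the empirical k-th order entropy. The function names `repair`, `expand` and `empirical_entropy` are hypothetical, and the pair counting is deliberately naive: it rescans the sequence in every round and counts overlapping occurrences, so it does not reflect the data structures an efficient implementation would use.

```python
from collections import Counter
from math import log2


def repair(text):
    """Naive Re-Pair sketch: returns (final sequence, rules).

    Assumption: `text` is a str; fresh symbols are tuples ("R", i),
    so they cannot collide with the original characters.
    """
    seq = list(text)
    rules = {}
    while True:
        # Count adjacent pairs (overlapping occurrences are counted naively).
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, freq = pairs.most_common(1)[0]
        if freq < 2:
            break
        new_symbol = ("R", len(rules))
        rules[new_symbol] = pair
        # Replace non-overlapping occurrences of the pair, left to right.
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(new_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
    return seq, rules


def expand(seq, rules):
    """Invert `repair` by expanding every rule symbol back into its pair."""
    out, stack = [], list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)
            stack.append(left)
        else:
            out.append(symbol)
    return out


def empirical_entropy(text, k):
    """Empirical k-th order entropy in bits per symbol.

    For each length-k context, take the zero-order entropy of the symbols
    that follow it, weight it by how many symbols follow it, and divide
    the total by n = len(text).
    """
    n = len(text)
    followers = {}
    for i in range(k, n):
        followers.setdefault(text[i - k:i], Counter())[text[i]] += 1
    total_bits = 0.0
    for counts in followers.values():
        m = sum(counts.values())
        total_bits -= sum(c * log2(c / m) for c in counts.values())
    return total_bits / n if n else 0.0


if __name__ == "__main__":
    t = "abababab"
    seq, rules = repair(t)
    assert "".join(expand(seq, rules)) == t
    print(len(seq), len(rules), empirical_entropy(t, 0), empirical_entropy(t, 1))
```

The sketch only counts symbols and rules; turning the final sequence and the dictionary into a bit count, as the bound in the abstract requires, needs an explicit encoding that is not shown here.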