It is well known that, given a probability distribution over $n$ characters, in the worst case it takes $\Theta(n \log n)$ bits to store a prefix code with minimum expected codeword length. However, in this paper we first show that, for any $\epsilon$ with $0 < \epsilon < 1/2$ and $1/\epsilon = O(\mathrm{polylog}(n))$, it takes $O(n \log \log (1/\epsilon))$ bits to store a prefix code with expected codeword length within an additive $\epsilon$ of the minimum. We then show that, for any constant $c > 1$, it takes $O(n^{1/c} \log n)$ bits to store a prefix code with expected codeword length at most $c$ times the minimum. In both cases, our data structures allow us to encode and decode any character in $O(1)$ time.