The memory system's large and growing contribution to system performance motivates more aggressive approaches to improving its efficiency. We propose and analyze a memory hierarchy that uses a unified compression scheme encompassing the last-level on-chip cache, the off-chip memory channel, and off-chip main memory. This scheme simultaneously increases the effective on-chip cache capacity, off-chip bandwidth, and main memory size, while avoiding compression and decompression overheads between levels. Simulations of the SPEC CPU2000 benchmarks using a 1MB cache and 128-byte blocks show an average speedup of 19%, while degrading performance by no more than 5%. The combined scheme achieves a peak improvement of 292%, compared to 165% and 83% for cache or bus compression alone. The compressed system generally provides even better performance as the block size is increased to 512 bytes.
Erik G. Hallnor, Steven K. Reinhardt