During post-silicon processor debugging, we frequently need to capture and dump the internal state of the processor. Because the internal state comprises all memory elements, and caches account for the bulk of them, the problem is essentially one of transferring cache contents off-chip to a logic analyzer. To reduce the transfer time and save expensive logic analyzer memory, we propose compressing the cache contents on their way out. We present a hardware compression engine for cache data that uses a Cache Aware Compression strategy, which exploits knowledge of the cache fields and their behavior to compress effectively. Experimental results indicate that the technique achieves 7-31% better compression than an approach that treats the data as a single long bit stream. We also describe and evaluate a parallel compression architecture that uses multiple compression engines, resulting in a 54% reduction in transfer time.
Anant Vishnoi, Preeti Ranjan Panda, M. Balakrishna
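
To make the contrast in the abstract concrete, the following is a minimal software sketch of field-aware versus flat compression of a cache dump. It is not the hardware engine described in the paper: zlib stands in for the compressor, and the cache line layout (a 4-byte tag, a 1-byte state, 32 data bytes), the synthetic contents, and all function names are assumptions made purely for illustration. The sketch shows only the structural idea of grouping like fields into separate streams before compressing them; it makes no claim about the compression ratios reported above.

```python
import random
import zlib

# Hypothetical field widths in bytes: tag, state, data (assumed, not from the paper).
FIELD_SIZES = (4, 1, 32)


def make_cache_lines(n, seed=0):
    """Build n synthetic cache lines as (tag, state, data) byte tuples."""
    rng = random.Random(seed)
    lines = []
    for i in range(n):
        tag = (0x4000 + i // 64).to_bytes(4, "big")  # tags change slowly
        state = bytes([rng.choice([0, 1, 1, 3])])    # few distinct state values
        data = bytes(rng.choice([0, 0, 0, rng.randrange(256)]) for _ in range(32))
        lines.append((tag, state, data))
    return lines


def compress_flat(lines):
    """Baseline: treat the whole dump as one long byte stream."""
    stream = b"".join(tag + state + data for tag, state, data in lines)
    return zlib.compress(stream, 9)


def compress_by_field(lines):
    """Field-aware variant: one stream per field, each compressed separately."""
    streams = [b"".join(line[k] for line in lines) for k in range(len(FIELD_SIZES))]
    return [zlib.compress(s, 9) for s in streams]


def decompress_by_field(blobs, n):
    """Invert compress_by_field, returning the original list of field tuples."""
    streams = [zlib.decompress(b) for b in blobs]
    return [
        tuple(s[i * w:(i + 1) * w] for s, w in zip(streams, FIELD_SIZES))
        for i in range(n)
    ]


if __name__ == "__main__":
    lines = make_cache_lines(1024)
    flat_size = len(compress_flat(lines))
    field_size = sum(len(b) for b in compress_by_field(lines))
    print("flat bytes:", flat_size, "field-aware bytes:", field_size)
```

Separating the fields gives each stream more uniform statistics, which is the intuition behind exploiting knowledge of the cache fields. Whether it helps on a real dump depends on the actual cache contents and on the compressor used.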