When address reference streams exhibit high degrees of spatial and temporal locality, many of the higher-order address lines carry redundant information. By caching the higher-order portions of address references in a set of dynamically allocated base registers, it becomes possible to transmit small register indices between the processor and memory instead of the high-order address bits themselves. Trace-driven simulations indicate that this technique can significantly reduce processor-to-memory address bus width without an appreciable loss in performance, thereby increasing available processor bandwidth. Our results imply that as much as 25% of the available I/O bandwidth of a processor is used less than 1% of the time.
Matthew K. Farrens, Arvin Park
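
The base-register scheme summarized in the abstract can be illustrated with a minimal sketch. Everything named below is an assumption made for illustration and is not taken from the paper: the class `BaseRegisterCache`, the functions `encode` and `decode`, the LRU replacement policy, the message tuples, and the default register count and field widths. The sketch shows only the core idea: both sides of the bus keep matching copies of the high-order address bits, so on a hit only a register index and the low-order bits need to be sent.

```python
from collections import OrderedDict


class BaseRegisterCache:
    """Hypothetical processor-side table of high-order address bits (LRU is assumed)."""

    def __init__(self, num_registers, low_bits):
        self.num_registers = num_registers
        self.low_bits = low_bits
        # Maps high-order value -> register index, least recently used first.
        self.slots = OrderedDict()

    def lookup(self, address):
        """Return (hit, register_index, low_bits_value) and update the table."""
        high = address >> self.low_bits
        low = address & ((1 << self.low_bits) - 1)
        if high in self.slots:
            self.slots.move_to_end(high)  # mark as most recently used
            return True, self.slots[high], low
        if len(self.slots) < self.num_registers:
            index = len(self.slots)  # a free register is still available
        else:
            _, index = self.slots.popitem(last=False)  # evict the LRU entry
        self.slots[high] = index
        return False, index, low


def encode(trace, num_registers=4, low_bits=8):
    """Processor side: emit a short message on a hit, a full one on a miss."""
    cache = BaseRegisterCache(num_registers, low_bits)
    for address in trace:
        hit, index, low = cache.lookup(address)
        if hit:
            yield ("hit", index, low)
        else:
            # A real narrow bus would need extra cycles to carry the high bits.
            yield ("miss", index, address >> low_bits, low)


def decode(messages, num_registers=4, low_bits=8):
    """Memory side: mirror the base registers and rebuild each full address."""
    registers = [0] * num_registers
    for message in messages:
        if message[0] == "miss":
            _, index, high, low = message
            registers[index] = high  # install the new high-order bits
        else:
            _, index, low = message
        yield (registers[index] << low_bits) | low


if __name__ == "__main__":
    trace = [0x1000, 0x1004, 0x2000, 0x1008, 0x2004]
    messages = list(encode(trace, num_registers=2, low_bits=8))
    hits = sum(1 for m in messages if m[0] == "hit")
    print(f"{hits} of {len(trace)} references sent as register index + low bits")
    print("round trip ok:", list(decode(messages, num_registers=2, low_bits=8)) == trace)
```

With locality in the trace, most references take the short form, which is the effect the abstract attributes to the technique. The hit rate in any real setting depends on the workload, the number of registers, and the replacement policy, none of which this sketch attempts to reproduce.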