Abstract. Technological advances and increasingly complex and dynamic application behavior argue for revisiting mechanisms that adapt logical cache block size to application charac...
Matthew A. Watkins, Sally A. McKee, Lambert Schael...
Code compression is important in embedded systems design since it reduces the code size (memory requirement) and thereby improves overall area, power and performance. Existing res...
Tarantula is an aggressive floating point machine targeted at technical, scientific and bioinformatics workloads, originally planned as a follow-on candidate to the EV8 processo...
Roger Espasa, Federico Ardanaz, Julio Gago, Roger ...
Runahead execution is a technique that improves processor performance by pre-executing the running application instead of stalling the processor when a long-latency cache miss occ...
Branches that depend directly or indirectly on load instructions are a leading cause of mispredictions by state-of-the-art branch predictors. For a branch of this type, there is a...