A major challenge of applying profile-based optimization on large real-world applications is how to capture adequate profile information. A large program, especially a GUI-based a...
In today's wide-issue processors, even small branch-misprediction rates introduce substantial performance penalties. Worse yet, inadequate branch prediction creates a bottlen...
Kevin Skadron, Margaret Martonosi, Douglas W. Clar...
How much do two profiles of the same program differ? When has a profile changed enough to warrant reexamination of the profiled program? And how should two or more profiles be com...
We identify that a set of multimedia applications exhibit highly regular read-after-read (RAR) and read-after-write (RAW) memory dependence streams. We exploit this regularity to ...
Register file access time represents one of the critical delays of current microprocessors, and it is expected to become more critical as future processors increase the instructio...
This paper describes and evaluates the profile-based optimizations in the Compaq C compiler tool chain for Alpha. The optimizations include superblock formation, inlining, command...
Load latency remains a significant bottleneck in dynamically scheduled pipelined processors. Load speculation techniques have been proposed to reduce this latency. Dependence Predi...