— We study the delays faced by instructions in the pipeline of a superscalar processor and its impact on power and performance. Instructions that are ready-on-dispatch (ROD) are normally delayed in the issue stage due to resource constraints even though their data dependencies are satisfied. These delays are reduced by issuing ROD instructions earlier than normal and executing them on slow functional units to obtain power benefits. This scheme achieves around 6% to 8% power reduction with average performance degradation of about 2%. Alternatively, instead of reducing the delays faced by instructions in the pipeline, increasing them by deliberately stalling certain instructions at appropriate times minimizes the duration for which the processor is underutilized leading to 2.5-4% power savings with less than 0.3% performance degradation.
G. Surendra, Subhasis Banerjee, S. K. Nandy