Processors that can execute multiple paths simultaneously will only exacerbate the fetch bandwidth problem already plaguing conventional processors. On a multiple-path processor, which speculatively executes the less likely paths of hard-to-predict branches, the work done along a speculative path is normally discarded if that path is found to be incorrect. It can instead be beneficial to keep these instruction traces in the processor for possible future use. This paper introduces instruction recycling, where previously decoded instructions from recently executed paths are injected back into the rename stage. This increases the supply of instructions to the execution pipeline and decreases fetch latency. In addition, if the operands of a recycled instruction have not changed, the instruction can bypass the issue and execution stages, benefiting from instruction reuse. Instruction recycling and reuse are examined for a simultaneous multithreading architecture with mult...
Steven Wallace, Dean M. Tullsen, Brad Calder
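
The distinction the abstract draws between recycling and reuse can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design evaluated in the paper: the names `TraceEntry`, `RecycleBuffer`, `record`, and `classify` are hypothetical, the buffer is indexed by program counter, and "operands have not changed" is approximated by comparing current source register values against those saved at the last execution.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class TraceEntry:
    """A previously decoded instruction kept from a recently executed path."""
    opcode: str
    src_regs: Tuple[str, ...]
    src_values: Tuple[int, ...]  # operand values seen at the last execution
    result: int


class RecycleBuffer:
    """Hypothetical store of decoded instructions, indexed by PC (assumption)."""

    def __init__(self) -> None:
        self.entries: Dict[int, TraceEntry] = {}

    def record(self, pc: int, opcode: str, src_regs: Tuple[str, ...],
               regs: Dict[str, int], result: int) -> None:
        # Keep the decoded instruction with the operand values it consumed.
        values = tuple(regs[r] for r in src_regs)
        self.entries[pc] = TraceEntry(opcode, tuple(src_regs), values, result)

    def classify(self, pc: int, regs: Dict[str, int]) -> str:
        entry = self.entries.get(pc)
        if entry is None:
            # Nothing stored: the instruction takes the normal fetch/decode path.
            return "fetch"
        current = tuple(regs[r] for r in entry.src_regs)
        if current == entry.src_values:
            # Operands unchanged: the stored result can stand in for
            # issue and execution (instruction reuse).
            return "reuse"
        # Operands changed: inject the decoded instruction at rename and
        # execute it again (instruction recycling without reuse).
        return "recycle"
```

In this model, an instruction recorded with `r1 = 2` and `r2 = 3` is classified as `"reuse"` while both registers keep those values, as `"recycle"` once either register changes, and any PC never recorded is classified as `"fetch"`.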