Modern microprocessors adopt speculative scheduling, in which instructions are scheduled several clock cycles before they actually execute. Because of this scheduling delay, a scheduling miss must be recovered across multiple levels of the dependence chain to prevent further unnecessary execution. We explore the design space of scheduling replay schemes that prevent the propagation of scheduling misses. We find that current and proposed replay schemes do not scale well and require instructions to execute in correct data dependence order, because they track dependences among instructions within the instruction window as part of the scheduling or execution process. In this paper, we propose token-based selective replay, which moves the dependence information propagation loop out of the scheduler, enabling lower complexity in the scheduling logic and support for data-speculation techniques at the expense of marginal IPC degradation compared to an ideal selective rep...
Ilhyun Kim, Mikko H. Lipasti