We propose dynamic scheduler designs to improve the scheduler scalability and reduce its complexity in the SMT processors. Our first design is an adaptation of the recently proposed instruction packing to SMT. Instruction packing opportunistically packs two instructions (possibly from different threads), each with at most one non-ready source operand at the time of dispatch, into the same issue queue entry. Our second design, termed 2OP_BLOCK, takes these ideas one step further and completely avoids the dispatching of the instructions with two non-ready source operands. This technique has several advantages. First, it reduces the scheduling complexity (and the associated delays) as the logic needed to support the instructions with 2 non-ready source operands is eliminated. More surprisingly, 2OP_BLOCK simultaneously improves the performance as the same issue queue entry may be reallocated multiple times to the instructions with at most one non-ready source (which usually spend fewer c...
Joseph J. Sharkey, Dmitry V. Ponomarev