Today’s general-purpose processors increasingly use multithreading to better leverage the additional on-chip real estate available with each technology generation. Simultaneous Multi-Threading (SMT) was originally proposed as a large dynamic superscalar processor with monolithic hardware structures shared among all threads. Intel’s Hyper-Threaded Pentium 4 processor partitions the queue structures between two threads, demonstrating more balanced performance by reducing the hoarding of structures by a single thread. IBM’s Power5 processor is a 2-way Chip Multiprocessor (CMP) of SMT processors, each supporting 2 threads, which significantly reduces design complexity and can improve power efficiency. This paper examines processor partitioning options for larger numbers of threads on a chip. While growing transistor budgets permit four- and eight-thread processors to be designed, design complexity, power dissipation, and wire scaling limitations create significant bar...
Ali El-Moursy, Rajeev Garg, David H. Albonesi, San