Several manufacturers have recently announced the first simultaneous-multithreaded (SMT) processors, both as single CPUs and as components of multi-CPU chips. All are small scale, comprising only two to four thread contexts. A significant impediment to the construction of larger-scale SMTs is the register file size required by a large number of contexts. This paper introduces and evaluates mini-threads, a simple extension to SMT that increases thread-level parallelism (TLP) without a commensurate increase in register file size. A mini-threaded SMT CPU adds per-thread state to each hardware context; an application executing in a context can create mini-threads that each use their own per-thread state but share the context's architectural register set. The resulting performance depends on the benefits of additional TLP compared to the costs of executing mini-threads with fewer registers. Our results quantify these factors in detail and demonstrate that mini-threads can improv...
Joshua Redstone, Susan J. Eggers, Henry M. Levy