Instruction Level Parallelism (ILP) speedups of an order-of-magnitude or greater may be possible using the techniques described herein. Traditional speculative code execution is t...
Conventional speculative architectures use branch prediction to evaluate the most likely execution path during program execution. However, certain branches are difficult to predic...
Artur Klauser, Todd M. Austin, Dirk Grunwald, Brad...
Performance of multithreaded programs is heavily influenced by the latencies of the thread management and synchronization operations. Improving these latencies becomes especially...
Alex Gontmakher, Avi Mendelson, Assaf Schuster, Gr...
Predicated Execution can be used to alleviate the costs associated with frequently mispredicted branches. This is accomplished by trading the cost of a mispredicted branch for exe...
This paper presents the Dynamic Simultaneous Multithreaded Architecture (DSMT). DSMT efficiently executes multiple threads from a single program on a SMT processor core. To accomp...