Modern computer architectures increasingly depend on mechanisms that estimate future control flow decisions to increase performance. Mechanisms such as speculative execution and p...
Sparse matrix-vector multiplication is an important kernel that often runs inefficiently on superscalar RISC processors. This paper describes techniques that increase instruction-...
As the performance gap between the CPU and main memory continues to grow, techniques to hide memory latency are essential to deliver a high performance computer system. Prefetchin...
This paper presents COBRA (Continuous Binary ReAdaptation), a runtime binary optimization framework, for multithreaded applications. It is currently implemented on Itanium 2 based...
Single-event upsets from particle strikes have become a key challenge in microprocessor design. Techniques to deal with these transient faults exist, but come at a cost. Designers...
Shubhendu S. Mukherjee, Christopher T. Weaver, Joe...