A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...
This paper presents preliminary efforts to develop compilation and execution environments that achieve performance portability of multilevel parallelization on hierarchical archit...
Walden Ko, Mark N. Yankelevsky, Dimitrios S. Nikol...
Multimedia instruction set extensions have become a prominent feature in desktop microprocessor platforms, promising superior performance on a wide range of floating-point and int...
A Zero Overhead Loop Buffer (ZOLB) is an architectural feature that is commonly found in DSP processors. This buffer can be viewed as a compiler managed cache that contains a sequ...
Gang-Ryung Uh, Yuhong Wang, David B. Whalley, Sanj...
Abstract. Increasingly threading has become an important architectural component of programming languages to support parallel programming. Previously we have proposed an elegant la...