Compiler technology for multimedia extensions must effectively utilize not only the SIMD compute engines but also the various levels of the memory hierarchy: superword registers,...
Chun Chen, Jaewook Shin, Shiva Kintali, Jacqueline...
Abstract— Design variability due to within-die and die-todie process variations has the potential to significantly reduce the maximum operating frequency and the effective yield...
In this paper, we describe a compilation system that automates much of the process of performance tuning that is currently done manually by application programmers interested in h...
Nastaran Baradaran, Jacqueline Chame, Chun Chen, P...
Abstract. Performance understanding and prediction are extremely important goals for guiding the application of program optimizations or in helping programmers focus their efforts...
Abstract. Empirical optimizers like ATLAS have been very effective in optimizing computational kernels in libraries. The best choice of parameters such as tile size and degree of l...