—While many-core accelerator architectures, such as today’s Graphics Processing Units (GPUs), offer orders of magnitude more raw computing power than contemporary CPUs, their m...
Aaron Ariel, Wilson W. L. Fung, Andrew E. Turner, ...
Although it is increasingly difficult for large scientific programs to attain a significant fraction of peak performance on systems based on microprocessors with substantial instr...
John M. Mellor-Crummey, Robert J. Fowler, Gabriel ...
The ability of fast similarity search at large scale is of great importance to many Information Retrieval (IR) applications. A promising way to accelerate similarity search is sem...
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...
Abstract. Understanding and controlling program behavior is a challenging objective for the design of advanced compilers and critical system development. In this paper, we propose ...