In many scientific applications, significant time is spent tuning codes for a particular highperformance architecture. Tuning approaches range from the relatively nonintrusive (...
Albert Hartono, Boyana Norris, Ponnuswamy Sadayapp...
Network performance in tightly-coupled multiprocessors typically degrades rapidly beyond network saturation. Consequently, designers must keep a network below its saturation point...
Mithuna Thottethodi, Alvin R. Lebeck, Shubhendu S....
Automated code generation and performance tuning techniques for concurrent architectures such as GPUs, Cell and FPGAs can provide integer factor speedups over multi-core processor...
Timely and cost-effective processing of large datasets has become a critical ingredient for the success of many academic, government, and industrial organizations. The combination...
—This paper describes the application of various search techniques to the problem of automatic empirical code optimization. The search process is a critical aspect of auto-tuning...