: The architecture of the IBM Cell BE processor represents a new approach for designing CPUs. The fast execution of legacy software has to stand back in order to achieve very high ...
Timo Schneider, Torsten Hoefler, Simon Wunderlich,...
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of searchbased performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
Linear algebra algorithms are fundamental to many computing applications. Modern GPUs are suited for many general purpose processing tasks and have emerged as inexpensive high per...
As multicore systems continue to gain ground in the High Performance Computing world, linear algebra algorithms have to be reformulated or new algorithms have to be developed in or...
Alfredo Buttari, Julien Langou, Jakub Kurzak, Jack...
Several methods have been proposed in the literature for the local enumeration of dense references for arrays distributed by the CYCLIC(k) data-distributionin High Performance For...
Gerardo Bandera, Pablo P. Trabado, Emilio L. Zapat...