We describe heterogeneous multi-CPU and multi-GPU implementations of Jacobi’s iterative method for the 2-D Poisson equation on a structured grid, in both single- and doublepreci...
Power and energy are first-order design constraints in high performance computing. Current research using dynamic voltage scaling (DVS) relies on trading increased execution time...
Barry Rountree, David K. Lowenthal, Bronis R. de S...
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...
The speed gap between processor and memory continues to limit performance. To address this problem, we explore the potential of eliminating Zero Loads—loads accessing memory loc...
Md. Mafijul Islam, Sally A. McKee, Per Stenstr&oum...
Single-particle 3D reconstruction from cryo-electron microscopy (cryo-EM) images is a kernel application of biological molecules analysis, as the computational requirement of whic...