This site uses cookies to deliver our services and to ensure you get the best experience. By continuing to use this site, you consent to our use of cookies and acknowledge that you have read and understand our Privacy Policy, Cookie Policy, and Terms
In this paper, we describe a generalized approach to deriving a custom data layout in multiple memory banks for array-based computations, to facilitate high-bandwidth parallel mem...
Effective utilization of cache memories is a key factor in achieving high performance in computing the Discrete Fourier Transform (DFT). Most optimizationtechniques for computing ...
Neungsoo Park, Dongsoo Kang, Kiran Bondalapati, Vi...
On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve...
Nathaniel McIntosh, Sandya Mannarswamy, Robert Hun...
Given a data layout of a large walkthrough scene, we present a novel and simple spatial hierarchy on the disk-pages of the layout that has notable advantages over a conventional s...
Behzad Sajadi, Yan Huang, Pablo Diaz-Gutierrez, Su...