Integer division, modulo, and remainder operations are expressive and useful operations. They are logical candidates to express complex data accesses such as the wrap-around behav...
Jeffrey Sheldon, Walter Lee, Ben Greenwald, Saman ...
In this paper, we present several novel strategies to improve software controlled cache utilization, so as to achieve lower power requirements for multi-media and signal processin...
The performance of applications on large shared-memory multiprocessors with coherent caches depends on the interaction between the granularity of data sharing, the size of the coh...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
Reuse distance (i.e. LRU stack distance) precisely characterizes program locality and has been a basic tool for memory system research since the 1970s. However, the high cost of m...
Xipeng Shen, Jonathan Shaw, Brian Meeker, Chen Din...
Matrix computation algorithms often exhibit dependencies between neighboring elements inside loop nests such that the frontier between computed elements and those to be computed w...