As the speed gap between CPU and memory widens, memory hierarchy has become the primary factor limiting program performance. Until now, the principal focus of hardware and softwar...
A Zero Overhead Loop Buffer (ZOLB) is an architectural feature that is commonly found in DSP processors. This buffer can be viewed as a compiler managed cache that contains a sequ...
Gang-Ryung Uh, Yuhong Wang, David B. Whalley, Sanj...
This paper introduces a method for improving program run-time performance by gathering work in an application and executing it efficiently in an integrated thread. Our methods ext...
To fully exploit multicore processors, applications are expected to provide a large degree of thread-level parallelism. While adequate for low core counts and their typical worklo...
Symbolic techniques based on Binary Decision Diagrams (BDDs) are widely employed for reasoning about temporal properties of hardware circuits and synchronous controllers. However, ...