Providing adequate data bandwidth is extremely important for a wide-issue superscalar processor to achieve its full performance potential. Adding a large number of ports to a data...
To achieve highly accurate branch prediction, it is necessary not only to allocate more resources to branch prediction hardware but also to improve the understanding of branch exe...
This paper describes Embra, a simulator for the processors, caches, and memory systems of uniprocessors and cache-coherent multiprocessors. When running as part of the SimOS simul...
The rapid advances in high-performancecomputer architectureand compilationtechniques provide both challenges and opportunitiesto exploitthe rich solution space of software pipeline...
Ramaswamy Govindarajan, Erik R. Altman, Guang R. G...
This paper presents the first automatic scheme to allocate local (stack) data in recursive functions to scratch-pad memory (SPM) in embedded systems. A scratch-pad is a fast direct...