Program execution traces are frequently used in industry and academia. Yet, most trace-compression algorithms have to be re-implemented every time the trace format is changed, whi...
In this paper, we describe how to extend the concept of superword-level parallelization (SLP), used for multimedia extension architectures, so that it can be applied in the presen...
Software code caches are becoming ubiquitous, in dynamic optimizers, runtime tool platforms, dynamic translators, fast simulators and emulators, and dynamic compilers. Caching fre...
To improve performance and reduce power, processor designers employ advances that shrink feature sizes, lower voltage levels, reduce noise margins, and increase clock rates. Howev...
George A. Reis, Jonathan Chang, Neil Vachharajani,...
Modern embedded microprocessors use low power on-chip memories called scratch-pad memories to store frequently executed instructions and data. Unlike traditional caches, scratch-p...
Rajiv A. Ravindran, Pracheeti D. Nagarkar, Ganesh ...
The growing complexity of modern processors has made the generation of highly efficient code increasingly difficult. Manual code generation is very time consuming, but it is oft...