Developing an optimizing compiler for a newly proposed architecture is extremely difficult when there is only a simulator of the machine available. Designing such a compiler requ...
John Cavazos, Christophe Dubach, Felix V. Agakov, ...
In this paper, we present an efficient methodology to validate high performance algorithms and prototype them using reconfigurable hardware. We follow a strict topdown Hardware/So...
Klaus Buchenrieder, Andreas Pyttel, Alexander Sedl...
The management of shared caches in multicore processors is a critical and challenging task. Many hardware and OS-based methods have been proposed. However, they may be hardly adop...
Today’s general-purpose processors are increasingly using multithreading in order to better leverage the additional on-chip real estate available with each technology generation...
Ali El-Moursy, Rajeev Garg, David H. Albonesi, San...
This paper explores an application-specific customization technique for the data cache, one of the foremost area/power consuming and performance determining microarchitectural fea...