Application specific processors offer the potential of rapidly designed logic specifically constructed to meet the performance and area demands of the task at hand. Recently, t...
This paper presents a technique called “workload decomposition” in which the CPU workload is decomposed in two parts: on-chip and off-chip. The on-chip workload signifies the ...
The speedup over a microprocessor that can be achieved by implementing some programs on an FPGA has been extensively reported. This paper presents an analysis, both quantitative a...
Zhi Guo, Walid A. Najjar, Frank Vahid, Kees A. Vis...
Abstract. Empirical optimizers like ATLAS have been very effective in optimizing computational kernels in libraries. The best choice of parameters such as tile size and degree of l...
Abstract Fabio Massacci1 and Nicola Zannone1 Department of Information and Communication Technology University of Trento - Italy {massacci,zannone} at dit.unitn.it The last years h...