In this paper, we propose a hardware/software partitioning method for improving application performance in embedded systems. Critical software parts are accelerated on the hardware of a generic single-chip system composed of an embedded processor and coarse-grain reconfigurable hardware. The reconfigurable hardware is realized as a two-dimensional array of Processing Elements. The partitioning flow uses an analysis procedure at the basic-block level to detect kernels in software. A list-based mapping algorithm has been developed to estimate the execution cycles of kernels on Coarse-Grain Reconfigurable Arrays. The proposed partitioning flow has been largely automated for programs described in C. Extensive hardware/software experiments on five real-life applications are presented. The benchmarks are shown to spend, on average, 69% of their instruction count in the kernels’ code, which accounts for only 11% of their code on average. The results illustrate that by ...
Michalis D. Galanis, Grigoris Dimitroulakos, Costa
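To make the basic-block-level kernel analysis mentioned in the abstract more concrete, the following is a minimal sketch of one plausible way to pick kernels from profiling data. It is not the authors' procedure, which the abstract does not detail. The `BasicBlock` type, the `detect_kernels` function, the `coverage` threshold, and the sample block figures are all hypothetical and chosen only for illustration. The sketch ranks basic blocks by dynamic instruction count (static size times execution frequency) and greedily selects the heaviest blocks until they cover a chosen share of the total dynamic instruction count. It then reports the dynamic share covered and the share of static code those blocks occupy, the two kinds of figures the abstract quotes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BasicBlock:
    """Hypothetical profile record for one basic block."""
    name: str
    static_instrs: int  # number of instructions in the block (code size)
    exec_count: int     # how many times the block ran during profiling


def detect_kernels(blocks, coverage=0.6):
    """Greedily pick the heaviest blocks until they cover `coverage`
    of the total dynamic instruction count.

    Returns (kernels, dynamic_share, static_share).
    This selection rule is an assumption, not the paper's method.
    """
    total_dynamic = sum(b.static_instrs * b.exec_count for b in blocks)
    if total_dynamic == 0:
        return [], 0.0, 0.0
    total_static = sum(b.static_instrs for b in blocks)

    # Rank blocks by their contribution to the dynamic instruction count.
    ranked = sorted(blocks,
                    key=lambda b: b.static_instrs * b.exec_count,
                    reverse=True)

    kernels, covered = [], 0
    for block in ranked:
        if covered >= coverage * total_dynamic:
            break
        kernels.append(block)
        covered += block.static_instrs * block.exec_count

    dynamic_share = covered / total_dynamic
    static_share = sum(b.static_instrs for b in kernels) / total_static
    return kernels, dynamic_share, static_share


# Made-up profile of a small program: one hot loop dominates execution.
blocks = [
    BasicBlock("init", static_instrs=40, exec_count=1),
    BasicBlock("loop_body", static_instrs=10, exec_count=1000),
    BasicBlock("filter", static_instrs=12, exec_count=500),
    BasicBlock("cleanup", static_instrs=38, exec_count=1),
]

kernels, dynamic_share, static_share = detect_kernels(blocks, coverage=0.6)
print([b.name for b in kernels], round(dynamic_share, 3), round(static_share, 3))
```

In this made-up profile a single block that holds 10% of the static code accounts for more than 60% of the executed instructions. A small fraction of the code dominating the instruction count is the same kind of skew that motivates moving kernels to the reconfigurable hardware.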