This paper presents an efficient implementation of a high-performance coarse-grain reconfigurable data-path on a mixed-granularity reconfigurable platform. The data-path consists of several coarse-grain components of the same type, a reconfigurable inter-component network, and a centralized register bank. Using a single, universal type of coarse-grain component is shown to increase the system’s performance through significant reductions in latency. A flexible interconnection network handles data transfers among the coarse-grain components and to and from the register bank. An automated methodology for mapping DSP and multimedia kernels onto the data-path is also presented. Chaining of operations is optimally exploited, and the architecture allows for simple and efficient algorithms for scheduling, live signal reduction, and component binding. Experimental results verify the impact of our architectural decisions and design automation methods.
Michalis D. Galanis, George Theodoridis, Spyros Tr