It is widely known that executing operations in parallel on multiprocessor systems generates a corresponding increase in memory accesses. Since the memory and bus subsystems provide limited access bandwidth, application performance cannot reach the level that the multiprocessor system's capabilities promise. This is the case for 2-Dimensional coarse-grained reconfigurable arrays, for which this paper presents a mapping methodology that aims to improve the performance of the mapped applications by alleviating the data bandwidth bottleneck. This is achieved by exploiting the applications’ data reuse opportunities, at both the data-dependence and source-code levels, together with the architecture’s foreground memory. The methodology considers a realistic 2-Dimensional coarse-grained reconfigurable architecture template, which can model the majority of existing coarse-grained reconfigurable array architectures. The experimental results show a significant reduction in both execution time and m...
Grigoris Dimitroulakos, Michalis D. Galanis, Costa