Scheduling and partitioning of task graphs on reconfigurable hardware must be carried out carefully to achieve the best possible performance. In this paper, we demonstrate that total execution time can be improved significantly by incorporating a library of hardware task implementations that contains multiple architectural variants for each hardware task, reflecting tradeoffs between resource utilization and task execution throughput. We develop a genetic-algorithm-based mapping approach that considers both the task graph and the target platform, and present results for an N-body simulation application, using estimated resource utilization figures for the constituent tasks and actual architectural constraints from different reconfigurable platforms. The results demonstrate improvements of up to 85.3% in execution time compared with choosing a fixed implementation variant for each task, while keeping the search time reasonable.
Miaoqing Huang, Vikram K. Narayana, Tarek A. El-Gh
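
To make the idea of variant selection concrete, the following is a minimal sketch of a genetic algorithm that picks one implementation variant per task. It is not the mapping approach developed in the paper. The task names, the area and time figures in `LIBRARY`, the `CAPACITY` and `RECONFIG_TIME` values, and the greedy packing cost model are all hypothetical assumptions chosen for illustration. The chromosome holds one variant index per task, and the population is seeded with the fixed-variant baseline so the result is never worse than it.

```python
import random

# Hypothetical library: each task maps to (area, exec_time) variants.
# Larger variants are faster; all numbers are made up for illustration.
LIBRARY = {
    "force":  [(40, 9.0), (70, 5.0), (95, 3.0)],
    "update": [(20, 4.0), (45, 2.0)],
    "reduce": [(30, 6.0), (60, 3.5)],
    "output": [(10, 2.0), (25, 1.5)],
}
ORDER = ["force", "update", "reduce", "output"]  # assumed topological order
CAPACITY = 100       # assumed area available per configuration
RECONFIG_TIME = 4.0  # assumed cost of loading one configuration


def total_time(chromosome):
    """Assumed cost model: pack tasks greedily into configurations in
    ORDER, then add execution times and one reconfiguration per load."""
    time, used, configs = 0.0, 0, 1
    for task, gene in zip(ORDER, chromosome):
        area, exec_time = LIBRARY[task][gene]
        if used + area > CAPACITY:  # task does not fit: open a new configuration
            configs += 1
            used = 0
        used += area
        time += exec_time
    return time + configs * RECONFIG_TIME


def evolve(generations=60, pop_size=20, mutation_rate=0.2, seed=0):
    """Return the best chromosome found (one variant index per task)."""
    rng = random.Random(seed)
    sizes = [len(LIBRARY[t]) for t in ORDER]
    pop = [[rng.randrange(n) for n in sizes] for _ in range(pop_size)]
    pop[0] = [0] * len(ORDER)  # seed with the fixed-variant baseline
    for _ in range(generations):
        pop.sort(key=total_time)
        parents = pop[: pop_size // 2]  # truncation selection, elitist
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(ORDER))  # one-point crossover
            child = a[:cut] + b[cut:]
            for i, n in enumerate(sizes):
                if rng.random() < mutation_rate:  # per-gene mutation
                    child[i] = rng.randrange(n)
            children.append(child)
        pop = parents + children
    return min(pop, key=total_time)


if __name__ == "__main__":
    best = evolve()
    print("variants:", dict(zip(ORDER, best)), "time:", total_time(best))
```

In this toy model, a faster variant occupies more area and can force an extra reconfiguration, so the best choice depends on both the task set and the platform capacity. The search space here is small enough to enumerate; a genetic search only pays off when the number of tasks and variants makes enumeration impractical.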