We present a fine-grain dynamic instruction placement algorithm for small L0 scratch-pad memories (spms), whose unit of transfer can be an individual instruction. Our algorithm ca...
Using FPGAs to accelerate High Performance Computing (HPC) applications is attractive, but has a huge associated cost: the time spent, not for developing efficient FPGA code but fo...
This paper addresses the problem of allocating (assigning and scheduling) periodic task modules to processing nodes in distributed real-time systems subject to task precedence and ...
Bank locality can be defined as localizing the number of load/store accesses to a small set of memory banks at a given time. An optimizing compiler can modify a given input code t...
Guilin Chen, Mahmut T. Kandemir, Hendra Saputra, M...
We developed a new modular synthesis approach for design of low-power core-based data-intensive application-specific systems on silicon. The power optimization is conducted in th...