As feature sizes decrease and chip sizes increase, the area and performance of chips become dominated by the interconnect. In spite of this trend, most existing synthesis systems ...
This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...
Abstract – In this paper, we propose a modeling approach for the average power consumption of macro-blocks that are typically used in digital signal processing (DSP) systems, suc...
The increasing numbers of cores, shared caches and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefull...
It has been widely recognized that the dynamic range information of an application can be exploited to reduce the datapath bitwidth of either processors or ASICs, and therefore th...