In this paper, we present the analysis, design and implementation of an estimator to realize large bit width unsigned integer multiplier units. Larger multiplier units are require...
Gang Quan, James P. Davis, Siddhaveerasharan Devar...
Modern embedded compute platforms increasingly contain both microprocessors and field-programmable gate arrays (FPGAs). The FPGAs may implement accelerators or other circuits to s...
Reconfigurability is becoming an important part of System-on-Chip (SoC) design to cope with the increasing demands for simultaneous flexibility and computational power. Current ha...
Optimal network performance is critical to efficient parallel scaling for communication-bound applications on large machines. With wormhole routing, no-load latencies do not increa...
Abhinav Bhatele, Eric J. Bohm, Laxmikant V. Kal&ea...
Efficient partitioning of parallel loops plays a critical role in high performance and efficient use of multiprocessor systems. Although a significant amount of work has been don...
Arun Kejariwal, Alexandru Nicolau, Utpal Banerjee,...