As the number of cores per machine increases, memory architectures are being redesigned to avoid bus contention and sustain higher throughput needs. The emergence of Non-Uniform M...
We propose a novel design estimation method for adaptive streaming applications to be implemented on a partially reconfigurable FPGA. Based on experimental results we enable accu...
Ingo Sander, Jun Zhu, Axel Jantsch, Andreas Herrho...
Efficient intra-node shared memory communication is important for High Performance Computing (HPC), especially with the emergence of multi-core architectures. As clusters continue ...
Recent works demonstrate several benefits of synthesizing software binaries onto FPGA hardware, including incorporating hardware design into established software tool flows with m...
In high-end processors, increasing the number of in-flight instructions can improve performance by overlapping useful processing with long-latency accesses to the main memory. Buf...