Heterogeneous or co-processor architectures are becoming an important component of high productivity computing systems (HPCS). In this work the performance of a GPU based HPCS is c...
David Huw Jones, Adam Powell, Christos-Savvas Boug...
Commercial soft processors are unable to effectively exploit the data parallelism present in many embedded systems workloads, requiring FPGA designers to exploit it (laboriously) ...
Peter Yiannacouras, J. Gregory Steffan, Jonathan R...
Iterative numerical algorithms with high memory bandwidth requirements but medium-size data sets (matrix size ∼ a few 100s) are highly appropriate for FPGA acceleration. This pap...
Abid Rafique, Nachiket Kapre, George A. Constantin...
—Escalating system-on-chip design complexity is the design community to raise the level of abstraction beyond register transfer level. Despite the unsuccessful adoptions of early...
Jason Cong, Bin Liu, Stephen Neuendorffer, Juanjo ...
Distributed Arithmetic techniques are widely used to implement Sum-of-Products computations such as calculations found in multimedia applications like FIR filtering and Discrete Co...