This paper presents a new GPU-based tensor voting implementation that achieves a significant performance improvement over the conventional CPU-based implementation. Although the t...
It is well known that LDPC decoding is computationally demanding and one of the hardest signal operations to parallelize. Beyond data dependencies that restrict the decoding of a ...
While general-purpose homogeneous multi-core architectures are becoming ubiquitous, there are clear indications that, for a number of important applications, a better performance/p...
Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data ...
Field-programmable gate arrays (FPGAs), graphics processing units (GPUs), and Sony’s PlayStation 2 vector units offer scope for hardware acceleration of applications. Implementin...
Lee W. Howes, Paul Price, Oskar Mencer, Olav Beckm...