Field programmable gate arrays (FPGAs), graphics processing units (GPUs) and Sony's PlayStation 2 vector units offer scope for hardware acceleration of applications. We compare the performance of these architectures using a unified description based on A Stream Compiler (ASC) for FPGAs, which has been extended to target GPUs and PS2 vector units. Programming these architectures from a single description enables us to reason about optimizations for the different architectures. Using the ASC description we implement a Montecarlo simulation, a Fast Fourier Transform (FFT) and a weighted sum algorithm. Our results show that without much optimization the GPU is suited to the Montecarlo simulation, while the weighted sum is better suited to PS2 vector units. FPGA implementations benefit particularly from architecture specific optimizations which ASC allows us to easily implement by adding simple annotations to the shared code.
Lee W. Howes, Paul Price, Oskar Mencer, Olav Beckm