A simple image-processing application is implemented on both the Ambric MPPA and an FPGA, using a similar design for the two devices. FPGAs perform extremely well on this kind of application and therefore provide a good benchmark for comparison. The Ambric implementation starts from a naive design and proceeds through several optimizations until it reaches a maximum frame rate of 164 FPS on 512 x 512 images, which is approximately 7x slower than the FPGA. The final Ambric implementation uses only 18 of the 336 available processors, achieves more than sufficient performance for real-time embedded applications, and leaves the remaining processors free for implementing additional algorithms. After introducing the image-processing application and its implementation on both devices, the paper compares and contrasts the intrinsic, general characteristics of Ambric MPPA and FPGA devices.
Brad L. Hutchings, Brent E. Nelson, Stephen West,