Embedded computing architectures can be designed to meet a variety of application-specific requirements. However, optimized hardware often needs compiler support to realize its potential. This is especially true for embedded image processing systems, where significant architectural variation is possible and targeted software can change drastically with the architecture. This paper presents methods to compile a single high-level source across a fundamental variation in data-parallel target architectures: processor granularity, ranging from a single processor to a massively parallel processor array. The approach uses single-PPE virtualization, which supports pixel-level data-parallel expressions that operate on a virtual one-pixel-per-processing-element (PPE) network, and applies pixel-locating transformations to retarget the code to a given target PPE. Unlike mainstream parallel computing techniques, this technique can be applied to lightweight SIMD targets ...
Sam Sander, Linda M. Wills