Existing library-based parallelization tools for implementing high-performance image processing applications often deliver sub-optimal performance. This is because inter-operation optimization, that is, optimization across library calls, is often not incorporated in the library implementations. This paper presents a simple, efficient, finite state machine-based method for global performance optimization, called 'lazy parallelization'. Experimental results based on this approach show significant performance improvements over non-optimized parallel implementations.
Frank J. Seinstra, Dennis Koelma