In retargeting loop-based code for multimedia instruction set extensions, a critical issue is that vector data types of mixed precision within a loop body complicate parallelization, because corresponding array elements are misaligned in the packed vectors. This paper presents a reverse-engineering approach to parallelization that extracts from the source code a multidimensional dataflow graph representation with explicit parallel semantics. The multidimensional annotations facilitate generating vector data type conversion code during code synthesis. This representation is independent of sequential artifacts, allowing code synthesis to proceed from an abstract data-parallel model of the program and the constraints imposed by the architecture, such as vector length and available data types. Our results show that this representation facilitates parallelization of a wider range of loops than traditional vectorization. The results of this parallelization indicate loop speedups of 2 to...
Lewis B. Baumstark Jr., Linda M. Wills
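
The mixed-precision misalignment described in the abstract can be illustrated with a small sketch. It is not the authors' representation or synthesis method. It only shows why a loop that combines 8-bit and 16-bit arrays needs conversion code. The 64-bit register width (`VEC_BYTES`) and the names `add_scalar`, `add_packed`, and `unpack_half` are assumptions made for illustration, and the packed registers are simulated in portable C rather than with real multimedia intrinsics.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: a 64-bit packed register, so one vector holds
 * 8 lanes of 8-bit data but only 4 lanes of 16-bit data. */
enum { VEC_BYTES = 8 };
enum { LANES_8 = VEC_BYTES / sizeof(uint8_t),
       LANES_16 = VEC_BYTES / sizeof(int16_t) };

/* Scalar reference loop with mixed precision in the body. */
void add_scalar(const uint8_t *a, const int16_t *b, int16_t *out, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = (int16_t)(a[i] + b[i]);
}

/* Hypothetical conversion step: widen one half (0 = low, 1 = high)
 * of an 8-lane byte vector into a 4-lane 16-bit vector. */
static void unpack_half(const uint8_t *src8, int half, int16_t dst16[LANES_16])
{
    for (int lane = 0; lane < LANES_16; lane++)
        dst16[lane] = (int16_t)src8[half * LANES_16 + lane];
}

/* Simulated packed version: one byte vector of a[] lines up with
 * two 16-bit vectors of b[], so each half must be unpacked before
 * the lane-wise add. */
void add_packed(const uint8_t *a, const int16_t *b, int16_t *out, size_t n)
{
    size_t i = 0;
    for (; i + LANES_8 <= n; i += LANES_8) {
        for (int half = 0; half < LANES_8 / LANES_16; half++) {
            int16_t widened[LANES_16];
            unpack_half(&a[i], half, widened);
            for (int lane = 0; lane < LANES_16; lane++) {
                size_t k = i + (size_t)half * LANES_16 + (size_t)lane;
                out[k] = (int16_t)(widened[lane] + b[k]);
            }
        }
    }
    /* Remainder elements that do not fill a whole byte vector. */
    for (; i < n; i++)
        out[i] = (int16_t)(a[i] + b[i]);
}
```

Calling `add_scalar` and `add_packed` on the same inputs should give identical outputs as long as the sums fit in 16 bits. On real hardware the inner lane loops would map to unpack and packed-add instructions.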