This report discusses some design and implementation issues in the HPJava language. Through example codes, we will illustrate how various language features have been designed to f...
Techniques for aggressive optimization and parallelization of applications can have the side-effect of introducing copy instructions, register-to-register move instructions, into t...
The automatic parallelization of C has always been frustrated by pointer arithmetic, irregular control flow and complicated data aggregation. Each of these problems is similar to f...
Although SIMD (Single Instruction stream Multiple Data stream) parallel computers have existed for decades, it is only in the past few years that a new version of SIMD has evolved...
We present a cache locality optimization technique that can optimize a loop nest even if the arrays referenced have different layouts in memory. Such a capability is required for a...
Mahmut T. Kandemir, J. Ramanujam, Alok N. Choudhar...
There is a class of sparse matrix computations, such as direct solvers of systems of linear equations, that change the fill-in (nonzero entries) of the coefficient matrix, and invo...
: In this paper, theoretical aspects to demonstrate the accuracy of the Interval Test (the I test and the direction vector I test) to be applied for resolving the problem stated ab...