Widespread use of wavelet transforms, as in JPEG2000, demands efficient implementations on general-purpose computers as well as dedicated hardware. The increasing availability of S...
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
Traditional code optimization techniques treat loops as nonpredictable structures and do not consider expressions containing array accesses for optimization. We show that...
Iterative stencil loops (ISLs) are used in many applications, and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
In this paper, we study the problem of scheduling parallel loops at compile time for a heterogeneous network of machines. We consider heterogeneity in three aspects of parallel pr...