Sciweavers

379 search results - page 6 / 76
» Optimal loop parallelization for maximizing iteration-level ...
MICRO
2005
IEEE
14 years 28 days ago
Exploiting Vector Parallelism in Software Pipelined Loops
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
IEEEPACT
1998
IEEE
13 years 11 months ago
A Matrix-Based Approach to the Global Locality Optimization Problem
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop tran...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
IFL
2005
Springer
14 years 26 days ago
With-Loop Fusion for Data Locality and Parallelism
With-loops are versatile array comprehensions used in the functional array language SaC to implement universally applicable array operations. We describe the fusion of with-loops a...
Clemens Grelck, Karsten Hinckfuß, Sven-Bodo ...
MICRO
1995
IEEE
13 years 11 months ago
Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation
Exploitation of instruction-level parallelism is an effective mechanism for improving the performance of modern super-scalar/VLIW processors. Various software techniques can be appl...
Jack W. Davidson, Sanjay Jinturkar
ICS
2000
Tsinghua U.
13 years 11 months ago
Automatic loop transformations and parallelization for Java
From a software engineering perspective, the Java programming language provides an attractive platform for writing numerically intensive applications. A major drawback hampering i...
Pedro V. Artigas, Manish Gupta, Samuel P. Midkiff,...