Sciweavers

379 search results - page 6 / 76
» Optimal loop parallelization for maximizing iteration-level ...
MICRO
2005
IEEE
Hardware
15 years 8 months ago
Exploiting Vector Parallelism in Software Pipelined Loops
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
IEEEPACT
1998
IEEE
15 years 7 months ago
A Matrix-Based Approach to the Global Locality Optimization Problem
Global locality analysis is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout optimizations. Pure loop tran...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
IFL
2005
Springer
Formal Methods
15 years 8 months ago
With-Loop Fusion for Data Locality and Parallelism
With-loops are versatile array comprehensions used in the functional array language SaC to implement universally applicable array operations. We describe the fusion of with-loops a...
Clemens Grelck, Karsten Hinckfuß, Sven-Bodo ...
MICRO
1995
IEEE
Hardware
15 years 6 months ago
Improving instruction-level parallelism by loop unrolling and dynamic memory disambiguation
Exploitation of instruction-level parallelism is an effective mechanism for improving the performance of modern super-scalar/VLIW processors. Various software techniques can be appl...
Jack W. Davidson, Sanjay Jinturkar
ICS
2000
Tsinghua U.
15 years 6 months ago
Automatic loop transformations and parallelization for Java
From a software engineering perspective, the Java programming language provides an attractive platform for writing numerically intensive applications. A major drawback hampering i...
Pedro V. Artigas, Manish Gupta, Samuel P. Midkiff,...