Sciweavers

22 search results - page 3 / 5
» Automatic Partitioning of Parallel Loops with Parallelepiped...
Sort
View
ICS
2009
Tsinghua U.
14 years 3 months ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron
CGO
2010
IEEE
14 years 3 months ago
Automatic creation of tile size selection models
Tiling is a widely used loop transformation for exposing/exploiting parallelism and data locality. Effective use of tiling requires selection and tuning of the tile sizes. This is...
Tomofumi Yuki, Lakshminarayanan Renganarayanan, Sa...
ICPP
1998
IEEE
14 years 1 months ago
A memory-layout oriented run-time technique for locality optimization
Exploiting locality at run-time is a complementary approach to a compiler approach for those applications with dynamic memory access patterns. This paper proposes a memory-layout ...
Yong Yan, Xiaodong Zhang, Zhao Zhang
ICPP
2000
IEEE
14 years 1 months ago
Partitioning Loops with Variable Dependence Distances
A new technique to parallelize loops with variable distance vectors is presented. The method extends previous methods in two ways. First, the present method makes it possible for ...
Yijun Yu, Erik H. D'Hollander
ICPP
1997
IEEE
14 years 1 months ago
Automatic Partitioning of Data and Computations on Scalable Shared Memory Multiprocessors
Abstract—This paper describes an algorithm for deriving data and computation partitions on scalable shared memory multiprocessors. The algorithm establishes affinity relationshi...
Sudarsan Tandri, Tarek S. Abdelrahman