Sciweavers

» Data Flow Equations for Explicitly Parallel Programs
HPCA
2008
IEEE
14 years 7 months ago
Branch-mispredict level parallelism (BLP) for control independence
A microprocessor's performance is fundamentally limited by the rate at which it can resolve branch mispredictions. Control independence (CI) architectures look for useful con...
Kshitiz Malik, Mayank Agarwal, Sam S. Stone, Kevin...
ASPLOS
2009
ACM
14 years 8 months ago
3D finite difference computation on GPUs using CUDA
In this paper we describe a GPU parallelization of the 3D finite difference computation using CUDA. Data access redundancy is used as the metric to determine the optimal implement...
Paulius Micikevicius
LCPC
1997
Springer
13 years 11 months ago
Automatic Data Decomposition for Message-Passing Machines
The data distribution problem is very complex, because it involves trade-off decisions between minimizing communication and maximizing parallelism. A common approach towards solving...
Mirela Damian-Iordache, Sriram V. Pemmaraju
ICPP
1999
IEEE
13 years 11 months ago
A Framework for Interprocedural Locality Optimization Using Both Loop and Data Layout Transformations
There has been much work recently on improving the locality performance of loop nests in scientific programs through the use of loop as well as data layout optimizations. However,...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
OOPSLA
2001
Springer
13 years 12 months ago
Modular Mixin-Based Inheritance for Application Frameworks
Mixin modules are proposed as an extension of a class-based programming language. Mixin modules combine parallel extension of classes, including extension of the self types for th...
Dominic Duggan, Ching-Ching Techaubol