Sciweavers

114 search results - page 8 / 23
» Loop Parallelization in the Polytope Model
Sort
View
ICS
1992
Tsinghua U.
13 years 11 months ago
Optimizing for parallelism and data locality
Previous research has used program transformation to introduce parallelism and to exploit data locality. Unfortunately,these twoobjectives have usuallybeen considered independentl...
Ken Kennedy, Kathryn S. McKinley
PPOPP
2005
ACM
14 years 1 months ago
Performance modeling and optimization of parallel out-of-core tensor contractions
The Tensor Contraction Engine (TCE) is a domain-specific compiler for implementing complex tensor contraction expressions arising in quantum chemistry applications modeling elect...
Xiaoyang Gao, Swarup Kumar Sahoo, Chi-Chung Lam, J...
MASCOTS
2010
13 years 9 months ago
Efficient Discovery of Loop Nests in Execution Traces
Execution and communication traces are central to performance modeling and analysis. Since the traces can be very long, meaningful compression and extraction of representative beha...
Qiang Xu, Jaspal Subhlok, Nathaniel Hammen
CGO
2007
IEEE
14 years 1 months ago
Iterative Optimization in the Polyhedral Model: Part I, One-Dimensional Time
Emerging microprocessors offer unprecedented parallel computing capabilities and deeper memory hierarchies, increasing the importance of loop transformations in optimizing compile...
Louis-Noël Pouchet, Cédric Bastoul, Al...
ICPPW
2006
IEEE
14 years 1 months ago
Multidimensional Dataflow-based Parallelization for Multimedia Instruction Set Extensions
In retargeting loop-based code for multimedia instruction set extensions, a critical issue is that vector data types of mixed precision within a loop body complicate the paralleli...
Lewis B. Baumstark Jr., Linda M. Wills