Sciweavers

PPOPP
2005
ACM
14 years 28 days ago
Performance modeling and optimization of parallel out-of-core tensor contractions
The Tensor Contraction Engine (TCE) is a domain-specific compiler for implementing complex tensor contraction expressions arising in quantum chemistry applications modeling elect...
Xiaoyang Gao, Swarup Kumar Sahoo, Chi-Chung Lam, J...
IEEEPACT
2007
IEEE
14 years 1 months ago
Automatic Correction of Loop Transformations
Loop nest optimization is a combinatorial problem. Due to the growing complexity of modern architectures, it involves two increasingly difficult tasks: (1) analyzing the profita...
Nicolas Vasilache, Albert Cohen, Louis-Noël P...
IEEEPACT
2002
IEEE
14 years 9 days ago
Optimizing Loop Performance for Clustered VLIW Architectures
Modern embedded systems often require high degrees of instruction-level parallelism (ILP) within strict constraints on power consumption and chip cost. Unfortunately, a high-perfo...
Yi Qian, Steve Carr, Philip H. Sweany
ICS
2000
Tsinghua U.
13 years 11 months ago
Fast greedy weighted fusion
Loop fusion is important to optimizing compilers because it is an important tool in managing the memory hierarchy. By fusing loops that use the same data elements, we can reduce t...
Ken Kennedy
EUROPAR
2001
Springer
13 years 12 months ago
Loop-Carried Code Placement
Traditional code optimization techniques treat loops as nonpredictable structures and do not consider expressions containing array accesses for optimization. We show that...
Peter Faber, Martin Griebl, Christian Lengauer