Sciweavers

241 search results - page 19 / 49
» Advanced Loop Optimizations for Parallel Computers
IEEEPACT
2009
IEEE
14 years 2 months ago
Polyhedral-Model Guided Loop-Nest Auto-Vectorization
Optimizing compilers apply numerous interdependent optimizations, leading to the notoriously difficult phase-ordering problem: that of deciding which transformations...
Konrad Trifunovic, Dorit Nuzman, Albert Cohen, Aya...
ICPPW
2006
IEEE
14 years 1 month ago
Towards a Source Level Compiler: Source Level Modulo Scheduling
Modulo scheduling is a major optimization of high-performance compilers wherein the body of a loop is replaced by an overlapping of instructions from different iterations. Hence ...
Yosi Ben-Asher, Danny Meisler
ICS
2001
Tsinghua U.
13 years 12 months ago
Computer aided hand tuning (CAHT): "applying case-based reasoning to performance tuning"
For most parallel and high-performance systems, tuning guides provide users with advice on optimizing the execution time of their programs. Execution time may be very sensitive...
Antoine Monsifrot, François Bodin
HPCA
2004
IEEE
14 years 7 months ago
Creating Converged Trace Schedules Using String Matching
This paper focuses on generating efficient software pipelined schedules for in-order machines, which we call Converged Trace Schedules. For a candidate loop, we form a string of t...
Satish Narayanasamy, Yuanfang Hu, Suleyman Sair, B...
CF
2005
ACM
13 years 9 months ago
A case for a working-set-based memory hierarchy
Modern microprocessor designs continue to obtain impressive performance gains through increasing clock rates and advances in the parallelism obtained via micro-architecture design...
Steve Carr, Soner Önder