Sciweavers

193 search results - page 36 / 39
» Automatic Parallelization and Optimization of Programs by Pr...
Sort
View
IEEEPACT
2008
IEEE
14 years 1 months ago
A tuning framework for software-managed memory hierarchies
Achieving good performance on a modern machine with a multi-level memory hierarchy, and in particular on a machine with software-managed memories, requires precise tuning of progr...
Manman Ren, Ji Young Park, Mike Houston, Alex Aike...
ICASSP
2009
IEEE
13 years 11 months ago
Generating high performance pruned FFT implementations
We derive a recursive general-radix pruned Cooley-Tukey fast Fourier transform (FFT) algorithm in Kronecker product notation. The algorithm is compatible with vectorization and pa...
Franz Franchetti, Markus Püschel
IEEEPACT
2008
IEEE
14 years 1 months ago
Exploiting loop-dependent stream reuse for stream processors
The memory access limits the performance of stream processors. By exploiting the reuse of data held in the Stream Register File (SRF), an on-chip storage, the number of memory acc...
Xuejun Yang, Ying Zhang, Jingling Xue, Ian Rogers,...
ICPP
2003
IEEE
14 years 23 days ago
Procedural Level Address Offset Assignment of DSP Applications with Loops
Automatic optimization of address offset assignment for DSP applications, which reduces the number of address arithmetic instructions to meet the tight memory size restrictions an...
Youtao Zhang, Jun Yang 0002
IEEEPACT
2006
IEEE
14 years 1 months ago
An empirical evaluation of chains of recurrences for array dependence testing
Code restructuring compilers rely heavily on program analysis techniques to automatically detect data dependences between program statements. Dependences between statement instanc...
Johnnie Birch, Robert A. van Engelen, Kyle A. Gall...