Sciweavers

356 search results - page 19 / 72
» Towards effective automatic parallelization for multicore sy...
IPPS
2003
IEEE
15 years 8 months ago
Quantifying Locality Effect in Data Access Delay: Memory logP
The application of hardware-parameterized models to distributed systems can result in omission of key bottlenecks such as the full cost of inter-node communication in a shared mem...
Kirk W. Cameron, Xian-He Sun
IPPS
2002
IEEE
15 years 8 months ago
Effective Cross-Platform, Multilevel Parallelism via Dynamic Adaptive Execution
This paper presents preliminary efforts to develop compilation and execution environments that achieve performance portability of multilevel parallelization on hierarchical archit...
Walden Ko, Mark N. Yankelevsky, Dimitrios S. Nikol...
LCPC
1991
Springer
15 years 7 months ago
Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs
This paper discusses the techniques used to hand-parallelize, for the Alliant FX/80, four Fortran programs from the Perfect-Benchmark suite. The paper also includes the execution ...
Rudolf Eigenmann, Jay Hoeflinger, Zhiyuan Li, Davi...
ICASSP
2009
IEEE
15 years 7 months ago
Generating high performance pruned FFT implementations
We derive a recursive general-radix pruned Cooley-Tukey fast Fourier transform (FFT) algorithm in Kronecker product notation. The algorithm is compatible with vectorization and pa...
Franz Franchetti, Markus Püschel
CAP
2010
14 years 10 months ago
A quantitative study of reductions in algebraic libraries
How much of existing computer algebra libraries is amenable to automatic parallelization? This is a difficult topic, yet of practical importance in the era of commodity multicore ...
Yue Li, Gabriel Dos Reis