Sciweavers

356 search results - page 19 / 72
» Towards effective automatic parallelization for multicore sy...
IPPS
2003
IEEE
14 years 1 month ago
Quantifying Locality Effect in Data Access Delay: Memory logP
The application of hardware-parameterized models to distributed systems can result in omission of key bottlenecks such as the full cost of inter-node communication in a shared mem...
Kirk W. Cameron, Xian-He Sun
IPPS
2002
IEEE
14 years 1 month ago
Effective Cross-Platform, Multilevel Parallelism via Dynamic Adaptive Execution
This paper presents preliminary efforts to develop compilation and execution environments that achieve performance portability of multilevel parallelization on hierarchical archit...
Walden Ko, Mark N. Yankelevsky, Dimitrios S. Nikol...
LCPC
1991
Springer
14 years 11 hours ago
Experience in the Automatic Parallelization of Four Perfect-Benchmark Programs
This paper discusses the techniques used to hand-parallelize, for the Alliant FX/80, four Fortran programs from the Perfect-Benchmark suite. The paper also includes the execution ...
Rudolf Eigenmann, Jay Hoeflinger, Zhiyuan Li, Davi...
ICASSP
2009
IEEE
14 years 10 days ago
Generating high performance pruned FFT implementations
We derive a recursive general-radix pruned Cooley-Tukey fast Fourier transform (FFT) algorithm in Kronecker product notation. The algorithm is compatible with vectorization and pa...
Franz Franchetti, Markus Püschel
CAP
2010
13 years 3 months ago
A quantitative study of reductions in algebraic libraries
How much of existing computer algebra libraries is amenable to automatic parallelization? This is a difficult topic, yet of practical importance in the era of commodity multicore ...
Yue Li, Gabriel Dos Reis