Sciweavers

67 search results - page 6 / 14
» Data transformations enabling loop vectorization on multithr...
Sort
View
EUROPAR
2000
Springer
13 years 11 months ago
Automatic SIMD Parallelization of Embedded Applications Based on Pattern Recognition
This paper investigates the potential for automatic mapping of typical embedded applications to architectures with multimedia instruction set extensions. For this purpose a (patter...
Rashindra Manniesing, Ireneusz Karkowski, Henk Cor...
ICS
2009
Tsinghua U.
14 years 2 months ago
Parametric multi-level tiling of imperfectly nested loops
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...
HPCA
2008
IEEE
14 years 7 months ago
Uncovering hidden loop level parallelism in sequential applications
As multicore systems become the dominant mainstream computing technology, one of the most difficult challenges the industry faces is the software. Applications with large amounts ...
Hongtao Zhong, Mojtaba Mehrara, Steven A. Lieberma...
ICS
2009
Tsinghua U.
14 years 2 months ago
Computer generation of fast fourier transforms for the cell broadband engine
The Cell BE is a multicore processor with eight vector accelerators (called SPEs) that implement explicit cache management through direct memory access engines. While the Cell has...
Srinivas Chellappa, Franz Franchetti, Markus P&uum...
IPPS
2007
IEEE
14 years 1 months ago
Improving Scalability of OpenMP Applications on Multi-core Systems Using Large Page Support
Modern multi-core architectures have become popular because of the limitations of deep pipelines and heating and power concerns. Some of these multi-core architectures such as the...
Ranjit Noronha, Dhabaleswar K. Panda