Sciweavers

173 search results - page 17 / 35
» Loop Parallelization Algorithms: From Parallelism Extraction...
LCTRTS
2005
Springer
14 years 1 month ago
Complementing software pipelining with software thread integration
Software pipelining is a critical optimization for producing efficient code for VLIW/EPIC and superscalar processors in high-performance embedded applications such as digital sign...
Won So, Alexander G. Dean
ISSTA
2010
ACM
13 years 8 months ago
Robust non-intrusive record-replay with processor extraction
With the advent of increasingly larger parallel machines, debugging is becoming more and more challenging. In particular, applications at this scale tend to behave non-determinist...
Filippo Gioachin, Gengbin Zheng, Laxmikant V. Kal&...
IEEEPACT
2009
IEEE
13 years 6 months ago
Algorithmic Skeletons within an Embedded Domain Specific Language for the CELL Processor
Efficiently using the hardware capabilities of the Cell processor, a heterogeneous chip multiprocessor that uses several levels of parallelism to deliver high performance, and bei...
Tarik Saidani, Joel Falcou, Claude Tadonki, Lionel...
FCCM
2011
IEEE
13 years 5 days ago
Synthesis of Platform Architectures from OpenCL Programs
The problem of automatically generating hardware modules from a high-level representation of an application has been at the research forefront in the last few years. In this pap...
Muhsen Owaida, Nikolaos Bellas, Konstantis Dalouka...
SPAA
2006
ACM
14 years 2 months ago
Towards automatic parallelization of tree reductions in dynamic programming
Tree contraction algorithms, whose idea was first proposed by Miller and Reif, are important parallel algorithms to implement efficient parallel programs manipulating trees. Desp...
Kiminori Matsuzaki, Zhenjiang Hu, Masato Takeichi