Sciweavers

572 search results - page 83 / 115
» A Performance Prediction Methodology for Data-dependent Para...
CLUSTER
2011
IEEE
12 years 8 months ago
Exploring Fine-Grained Task-Based Execution on Multi-GPU Systems
Using multi-GPU systems, including GPU clusters, is gaining popularity in scientific computing. However, when using multiple GPUs concurrently, the conventional data parallel GPU...
Long Chen, Oreste Villa, Guang R. Gao
FCCM
2008
IEEE
14 years 3 months ago
Simultaneous Retiming and Placement for Pipelined Netlists
Although pipelining or C-slowing an FPGA-based application can potentially dramatically improve the performance, this poses a question for conventional reconfigurable architecture...
Kenneth Eguro, Scott Hauck
ICPP
1999
IEEE
14 years 1 month ago
SLC: Symbolic Scheduling for Executing Parameterized Task Graphs on Multiprocessors
Task graph scheduling has been found effective in performance prediction and optimization of parallel applications. A number of static scheduling algorithms have been proposed for...
Michel Cosnard, Emmanuel Jeannot, Tao Yang
ISCAPDCS
2003
13 years 10 months ago
Loop Transformation Techniques To Aid In Loop Unrolling and Multithreading
In modern computer systems loops present a great deal of opportunities for increasing Instruction Level and Thread Level Parallelism. Loop unrolling is a technique used to obtain ...
Litong Song, Yuhua Zhang, Krishna M. Kavi
IPPS
2008
IEEE
14 years 3 months ago
Massive supercomputing coping with heterogeneity of modern accelerators
Heterogeneous supercomputers with combined general purpose and accelerated CPUs promise to be the future major architecture due to their wide-ranging generality and superior perfor...
Toshio Endo, Satoshi Matsuoka