Sciweavers

468 search results - page 60 / 94
» A compiler for high performance computing with many-core acc...
Sort
View
IPPS
2009
IEEE
14 years 2 months ago
Flexible pipelining design for recursive variable expansion
Many image and signal processing kernels can be optimized for performance consuming a reasonable area by doing loops parallelization with extensive use of pipelining. This paper p...
Zubair Nawaz, Thomas Marconi, Koen Bertels, Todor ...
PPOPP
2009
ACM
14 years 8 months ago
Effective performance measurement and analysis of multithreaded applications
Understanding why the performance of a multithreaded program does not improve linearly with the number of cores in a sharedmemory node populated with one or more multicore process...
Nathan R. Tallent, John M. Mellor-Crummey
ISHPC
1999
Springer
13 years 12 months ago
Instruction-Level Microprocessor Modeling of Scientific Applications
Superscalar microprocessor efficiency is generally not as high as anticipated. In fact, sustained utilization below thirty percent of peak is not uncommon, even for fully optimized...
Kirk W. Cameron, Yong Luo, James Scharzmeier
PPOPP
2010
ACM
14 years 2 months ago
An adaptive performance modeling tool for GPU architectures
This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information ...
Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. P...
FPGA
2010
ACM
232views FPGA» more  FPGA 2010»
13 years 7 months ago
High-throughput bayesian computing machine with reconfigurable hardware
We use reconfigurable hardware to construct a high throughput Bayesian computing machine (BCM) capable of evaluating probabilistic networks with arbitrary DAG (directed acyclic gr...
Mingjie Lin, Ilia Lebedev, John Wawrzynek