Sciweavers

201 search results - page 32 / 41
» Estimating the Parallel Start-Up Overhead for Parallelizing ...
Sort
View
PPOPP
2009
ACM
14 years 8 months ago
A compiler-directed data prefetching scheme for chip multiprocessors
Data prefetching has been widely used in the past as a technique for hiding memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
ISHPC
2000
Springer
13 years 11 months ago
Leveraging Transparent Data Distribution in OpenMP via User-Level Dynamic Page Migration
This paper describes transparent mechanisms for emulating some of the data distribution facilities offered by traditional data-parallel programming models, such as High Performance...
Dimitrios S. Nikolopoulos, Theodore S. Papatheodor...
IEEEPACT
2005
IEEE
14 years 1 months ago
An Event-Driven Multithreaded Dynamic Optimization Framework
Dynamic optimization has the potential to adapt the program’s behavior at run-time to deliver performance improvements over static optimization. Dynamic optimization systems usu...
Weifeng Zhang, Brad Calder, Dean M. Tullsen
CASES
2009
ACM
13 years 11 months ago
Exploiting residue number system for power-efficient digital signal processing in embedded processors
2's complement number system imposes a fundamental limitation on the power and performance of arithmetic circuits, due to the fundamental need of cross-datapath carry propaga...
Rooju Chokshi, Krzysztof S. Berezowski, Aviral Shr...
IWOMP
2009
Springer
14 years 2 months ago
Scalability Evaluation of Barrier Algorithms for OpenMP
OpenMP relies heavily on barrier synchronization to coordinate the work of threads that are performing the computations in a parallel region. A good implementation of barriers is ...
Ramachandra C. Nanjegowda, Oscar Hernandez, Barbar...