Sciweavers

» Loop Distribution and Fusion with Timing and Code Size Optim...
CASES
2006
ACM
14 years 1 month ago
Adapting compilation techniques to enhance the packing of instructions into registers
The architectural design of embedded systems is becoming increasingly idiosyncratic to meet varying constraints regarding energy consumption, code size, and execution time. Tradit...
Stephen Hines, David B. Whalley, Gary S. Tyson
IPPS
2009
IEEE
14 years 2 months ago
Annotation-based empirical performance tuning using Orio
In many scientific applications, significant time is spent tuning codes for a particular high-performance architecture. Tuning approaches range from the relatively nonintrusive (...
Albert Hartono, Boyana Norris, Ponnuswamy Sadayapp...
ICPPW
2005
IEEE
14 years 1 month ago
Speculative Parallel Threading Architecture and Compilation
Thread-level speculation is a technique that brings thread-level parallelism beyond the data-flow limit by executing a piece of code ahead of time speculatively before all its inp...
Xiao-Feng Li, Zhao-Hui Du, Chen Yang, Chu-Cheow Li...
SPAA
2003
ACM
14 years 21 days ago
Performance comparison of MPI and three OpenMP programming styles on shared memory multiprocessors
When using a shared memory multiprocessor, the programmer faces the selection of the portable programming model which will deliver the best performance. Even if he restricts his c...
Géraud Krawezik
VALUETOOLS
2006
ACM
14 years 1 month ago
Detailed cache simulation for detecting bottleneck, miss reason and optimization potentialities
Cache locality optimization is an efficient way for reducing the idle time of modern processors in waiting for needed data. This kind of optimization can be achieved either on the...
Jie Tao, Wolfgang Karl