Sciweavers

272 search results - page 44 / 55
» Code Transformations to Improve Memory Parallelism
Sort
View
ISCA
2008
IEEE
148views Hardware» more  ISCA 2008»
14 years 3 months ago
Atomic Vector Operations on Chip Multiprocessors
The current trend is for processors to deliver dramatic improvements in parallel performance while only modestly improving serial performance. Parallel performance is harvested th...
Sanjeev Kumar, Daehyun Kim, Mikhail Smelyanskiy, Y...
EUROPAR
2003
Springer
14 years 1 months ago
Partial Redundancy Elimination with Predication Techniques
Partial redundancy elimination (PRE) techniques play an important role in optimizing compilers. Many optimizations, such as elimination of redundant expressions, communication opti...
Bernhard Scholz, Eduard Mehofer, R. Nigel Horspool
IPPS
2008
IEEE
14 years 3 months ago
Lattice Boltzmann simulation optimization on leading multicore platforms
We present an auto-tuning approach to optimize application performance on emerging multicore architectures. The methodology extends the idea of searchbased performance optimizatio...
Samuel Williams, Jonathan Carter, Leonid Oliker, J...
ISCA
2002
IEEE
104views Hardware» more  ISCA 2002»
13 years 8 months ago
Speculative Dynamic Vectorization
Traditional vector architectures have shown to be very effective for regular codes where the compiler can detect data-level parallelism. However, this SIMD parallelism is also pre...
Alex Pajuelo, Antonio González, Mateo Valer...
ICPP
1995
IEEE
14 years 6 days ago
Progress: A Toolkit for Interactive Program Steering
Interactive program steering permits researchers to monitor and guide their applications during runtime. Interactive steering can help make end users more effective in addressing ...
Jeffrey S. Vetter, Karsten Schwan