Sciweavers

60 search results - page 9 / 12
» DPF: A Data Parallel Fortran Benchmark Suite
Sort
View
131
Voted
IPPS
2003
IEEE
15 years 9 months ago
Extending OpenMP to Support Slipstream Execution Mode
OpenMP has emerged as a widely accepted standard for writing shared memory programs. Hardware-specific extensions such as data placement are usually needed to improve the scalabi...
Khaled Z. Ibrahim, Gregory T. Byrd
108
Voted
SPAA
2009
ACM
16 years 23 days ago
Optimizing transactions for captured memory
In this paper, we identify transaction-local memory as a major source of overhead from compiler instrumentation in software transactional memory (STM). Transaction-local memory is...
Aleksandar Dragojevic, Yang Ni, Ali-Reza Adl-Tabat...
121
Voted
HPCA
1999
IEEE
15 years 8 months ago
Dynamically Exploiting Narrow Width Operands to Improve Processor Power and Performance
In general-purpose microprocessors, recent trends have pushed towards 64-bit word widths, primarily to accommodate the large addressing needs of some programs. Many integer proble...
David Brooks, Margaret Martonosi
118
Voted
IEEEPACT
2005
IEEE
15 years 9 months ago
Future Execution: A Hardware Prefetching Technique for Chip Multiprocessors
This paper proposes a new hardware technique for using one core of a CMP to prefetch data for a thread running on another core. Our approach simply executes a copy of all non-cont...
Ilya Ganusov, Martin Burtscher
136
Voted
IJHPCA
2010
105views more  IJHPCA 2010»
15 years 2 months ago
A Pipelined Algorithm for Large, Irregular All-Gather Problems
We describe and evaluate a new, pipelined algorithm for large, irregular all-gather problems. In the irregular all-gather problem each process in a set of processes contributes in...
Jesper Larsson Träff, Andreas Ripke, Christia...