Search Sciweavers | Sciweavers

272 search results - page 6 / 55

» Code Transformations to Improve Memory Parallelism

IPPS 1999, IEEE

A Graph Based Framework to Detect Optimal Memory Layouts for Improving Data Locality

In order to extract high levels of performance from modern parallel architectures, the effective management of deep memory hierarchies is very important. While architectural advan...

Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...

PPOPP 2010, ACM

Data transformations enabling loop vectorization on multithreaded data parallel architectures

Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memo...

Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrig...

IPPS 2007, IEEE

Optimizing Inter-Nest Data Locality Using Loop Splitting and Reordering

With the increasing gap between processor speed and memory latency, the performance of data-dominated programs is becoming more reliant on fast data access, which can be improved...

Sofiane Naci

IEEEPACT 1999, IEEE

On Reducing False Sharing while Improving Locality on Shared Memory Multiprocessors

The performance of applications on large shared-memory multiprocessors with coherent caches depends on the interaction between the granularity of data sharing, the size of the coh...

Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...

PLDI 1995, ACM

Improving Balanced Scheduling with Compiler Optimizations that Increase Instruction-Level Parallelism

Traditional list schedulers order instructions based on an optimistic estimate of the load latency imposed by the hardware and therefore cannot respond to variations in memory lat...

Jack L. Lo, Susan J. Eggers
