Search Sciweavers | Sciweavers

260 search results - page 23 / 52

» Performance Modelling and Optimization of Memory Access on C...


PPOPP
2010
ACM

353 views | Distributed and Parallel Com...

Data transformations enabling loop vectorization on multithreaded data parallel architectures

16 years 13 days ago

Download www.ece.neu.edu

Loop vectorization, a key feature exploited to obtain high performance on Single Instruction Multiple Data (SIMD) vector architectures, is significantly hindered by irregular memo...

Byunghyun Jang, Perhaad Mistry, Dana Schaa, Rodrig...



EUROPAR
1997
Springer

126 views | Distributed and Parallel Com...

Modulo Scheduling with Cache Reuse Information

15 years 7 months ago

Download www.cs.rochester.edu

Instruction scheduling in general, and software pipelining in particular, face the difficult task of scheduling operations in the presence of uncertain latencies. The largest contrib...

Chen Ding, Steve Carr, Philip H. Sweany



HPCA
2009
IEEE

176 views | Distributed and Parallel Com...

Design and implementation of software-managed caches for multicores with local memory

16 years 3 months ago

Download www.multicoreinfo.com

Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...

Sangmin Seo, Jaejin Lee, Zehra Sura



IPPS
1997
IEEE

139 views | Distributed and Parallel Com...

DPF: A Data Parallel Fortran Benchmark Suite

15 years 7 months ago

Download ipdps.cc.gatech.edu

We present the Data Parallel Fortran (DPF) benchmark suite, a set of data parallel Fortran codes for evaluating data parallel compilers appropriate for any target parallel architectu...

Y. Charlie Hu, S. Lennart Johnsson, Dimitris Kehag...



PC
2010

190 views | Management

High-performance cone beam reconstruction using CUDA compatible GPUs

15 years 1 month ago

Download www-hagi.ist.osaka-u.ac.jp

Compute unified device architecture (CUDA) is a software development platform that allows us to run C-like programs on the nVIDIA graphics processing unit (GPU). This paper prese...

Yusuke Okitsu, Fumihiko Ino, Kenichi Hagihara

