Sciweavers

39 search results - page 6 / 8
» Optimized Dense Matrix Multiplication on a Many-Core Archite...
SAIG
2000
Springer
13 years 11 months ago
Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW
Achieving peak performance in important numerical kernels such as dense matrix multiply or sparse-matrix vector multiplication usually requires extensive, machine-dependent tuning ...
Rich Vuduc, James Demmel
ICS
2005
Tsinghua U.
14 years 1 month ago
Think globally, search locally
A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...
Kamen Yotov, Keshav Pingali, Paul Stodghill
IPPS
2009
IEEE
14 years 2 months ago
Annotation-based empirical performance tuning using Orio
In many scientific applications, significant time is spent tuning codes for a particular high-performance architecture. Tuning approaches range from the relatively nonintrusive (...
Albert Hartono, Boyana Norris, Ponnuswamy Sadayapp...
IPPS
2007
IEEE
14 years 1 month ago
Memory Optimizations For Fast Power-Aware Sparse Computations
We consider memory subsystem optimizations for improving the performance of sparse scientific computation while reducing the power consumed by the CPU and memory. We first co...
Konrad Malkowski, Padma Raghavan, Mary Jane Irwin
MOBIHOC
2005
ACM
14 years 7 months ago
Low-coordination topologies for redundancy in sensor networks
Tiny, low-cost sensor devices are expected to be failure-prone and hence in many realistic deployment scenarios for sensor networks these nodes are deployed in higher than necessa...
Rajagopal Iyengar, Koushik Kar, Suman Banerjee