Sciweavers

955 search results - page 19 / 191
» Performance optimization of multiple memory architectures fo...
Sort
View
152
Voted
ICPP
2008
IEEE
15 years 9 months ago
Improving the Performance of Multithreaded Sparse Matrix-Vector Multiplication Using Index and Value Compression
Abstract—The Sparse Matrix-Vector Multiplication kernel exhibits limited potential for taking advantage of modern shared memory architectures due to its large memory bandwidth re...
Kornilios Kourtis, Georgios I. Goumas, Nectarios K...
125
Voted
ARCS
2009
Springer
15 years 9 months ago
Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture
The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software c...
Joseph Gebis, Leonid Oliker, John Shalf, Samuel Wi...
124
Voted
HPCA
1995
IEEE
15 years 6 months ago
Program Balance and Its Impact on High Performance RISC Architectures
Information on the behavior of programs is essential for deciding the number and nature of functional units in high performance architectures. In this paper, we present studies on...
Lizy Kurian John, Vinod Reddy, Paul T. Hulina, Lee...
143
Voted
APCSAC
2004
IEEE
15 years 6 months ago
Continuous Adaptive Object-Code Re-optimization Framework
Dynamic optimization presents opportunities for finding run-time bottlenecks and deploying optimizations in statically compiled programs. In this paper, we discuss our current impl...
Howard Chen, Jiwei Lu, Wei-Chung Hsu, Pen-Chung Ye...
140
Voted
ISCA
1996
IEEE
130views Hardware» more  ISCA 1996»
15 years 6 months ago
Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors
Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to addres...
Mark Horowitz, Margaret Martonosi, Todd C. Mowry, ...