Sciweavers

955 search results - page 19 / 191
» Performance optimization of multiple memory architectures fo...
Sort
View
ICPP
2008
IEEE
14 years 3 months ago
Improving the Performance of Multithreaded Sparse Matrix-Vector Multiplication Using Index and Value Compression
Abstract—The Sparse Matrix-Vector Multiplication kernel exhibits limited potential for taking advantage of modern shared memory architectures due to its large memory bandwidth re...
Kornilios Kourtis, Georgios I. Goumas, Nectarios K...
ARCS
2009
Springer
14 years 3 months ago
Improving Memory Subsystem Performance Using ViVA: Virtual Vector Architecture
The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software c...
Joseph Gebis, Leonid Oliker, John Shalf, Samuel Wi...
HPCA
1995
IEEE
14 years 16 days ago
Program Balance and Its Impact on High Performance RISC Architectures
Information on the behavior of programs is essential for deciding the number and nature of functional units in high performance architectures. In this paper, we present studies on...
Lizy Kurian John, Vinod Reddy, Paul T. Hulina, Lee...
APCSAC
2004
IEEE
14 years 22 days ago
Continuous Adaptive Object-Code Re-optimization Framework
Dynamic optimization presents opportunities for finding run-time bottlenecks and deploying optimizations in statically compiled programs. In this paper, we discuss our current impl...
Howard Chen, Jiwei Lu, Wei-Chung Hsu, Pen-Chung Ye...
ISCA
1996
IEEE
130views Hardware» more  ISCA 1996»
14 years 1 months ago
Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors
Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to addres...
Mark Horowitz, Margaret Martonosi, Todd C. Mowry, ...