Sciweavers

141 search results - page 19 / 29
» Load Execution Latency Reduction
Sort
View
ISCA
1994
IEEE
129views Hardware» more  ISCA 1994»
14 years 21 days ago
Impact of Sharing-Based Thread Placement on Multithreaded Architectures
Multithreaded architectures context switch between instruction streams to hide memory access latency. Although this improves processor utilization, it can increase cache interfere...
Radhika Thekkath, Susan J. Eggers
ISCA
1997
IEEE
90views Hardware» more  ISCA 1997»
14 years 24 days ago
The Interaction of Software Prefetching with ILP Processors in Shared-Memory Systems
Current microprocessors aggressively exploit instructionlevel parallelism (ILP) through techniques such as multiple issue, dynamic scheduling, and non-blocking reads. Recent work ...
Parthasarathy Ranganathan, Vijay S. Pai, Hazim Abd...
ISCA
1996
IEEE
126views Hardware» more  ISCA 1996»
14 years 23 days ago
Memory Bandwidth Limitations of Future Microprocessors
This paper makes the case that pin bandwidth will be a critical consideration for future microprocessors. We show that many of the techniques used to tolerate growing memory laten...
Doug Burger, James R. Goodman, Alain Kägi
HIPC
2009
Springer
13 years 6 months ago
Distance-aware round-robin mapping for large NUCA caches
In many-core architectures, memory blocks are commonly assigned to the banks of a NUCA cache by following a physical mapping. This mapping assigns blocks to cache banks in a round-...
Alberto Ros, Marcelo Cintra, Manuel E. Acacio, Jos...
HPCA
2005
IEEE
14 years 2 months ago
Microarchitectural Wire Management for Performance and Power in Partitioned Architectures
Future high-performance billion-transistor processors are likely to employ partitioned architectures to achieve high clock speeds, high parallelism, low design complexity, and low...
Rajeev Balasubramonian, Naveen Muralimanohar, Kart...