Sciweavers

DAC
2012
ACM
12 years 2 months ago
Run-time power-down strategies for real-time SDRAM memory controllers
Powering down SDRAMs at run-time reduces memory energy consumption significantly, but often at the cost of performance. If employed speculatively with real-time memory controller...
Karthik Chandrasekar 0001, Benny Akesson, Kees Goo...
MICRO
2010
IEEE
270views Hardware» more  MICRO 2010»
13 years 10 months ago
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
Abstract-- We consider the problem of how to improve memory latency tolerance in massively multithreaded GPGPUs when the thread-level parallelism of an application is not sufficien...
Jaekyu Lee, Nagesh B. Lakshminarayana, Hyesoon Kim...
HICSS
1995
IEEE
109views Biometrics» more  HICSS 1995»
14 years 4 months ago
The architecture of an optimistic CPU: the WarpEngine
The architecture for a shared memory CPU is described. The CPU allows for parallelism down to the level of single instructions and is tolerant of memory latency. All executable in...
John G. Cleary, Murray Pearson, Husam Kinawi
ICS
2000
Tsinghua U.
14 years 4 months ago
Push vs. pull: data movement for linked data structures
As the performance gap between the CPU and main memory continues to grow, techniques to hide memory latency are essential to deliver a high performance computer system. Prefetchin...
Chia-Lin Yang, Alvin R. Lebeck
ISCA
1996
IEEE
130views Hardware» more  ISCA 1996»
14 years 4 months ago
Informing Memory Operations: Providing Memory Performance Feedback in Modern Processors
Memory latency is an important bottleneck in system performance that cannot be adequately solved by hardware alone. Several promising software techniques have been shown to addres...
Mark Horowitz, Margaret Martonosi, Todd C. Mowry, ...
EUROMICRO
1998
IEEE
14 years 4 months ago
The Latency Hiding Effectiveness of Decoupled Access/Execute Processors
Several studies have demonstrated that out-of-order execution processors may not be the most adequate organization for wide issue processors due to the increasing penalties that w...
Joan-Manuel Parcerisa, Antonio González
ICS
1999
Tsinghua U.
14 years 4 months ago
An experimental evaluation of tiling and shackling for memory hierarchy management
On modern computers, the performance of programs is often limited by memory latency rather than by processor cycle time. To reduce the impact of memory latency, the restructuring ...
Induprakas Kodukula, Keshav Pingali, Robert Cox, D...
ISSS
1999
IEEE
89views Hardware» more  ISSS 1999»
14 years 4 months ago
Loop Scheduling and Partitions for Hiding Memory Latencies
Partition Scheduling with Prefetching (PSP) is a memory latency hiding technique which combines the loop pipelining technique with data prefetching. In PSP, the iteration space is...
Fei Chen, Edwin Hsing-Mean Sha
PLDI
1999
ACM
14 years 4 months ago
Cache-Conscious Structure Layout
Hardware trends have produced an increasing disparity between processor speeds and memory access times. While a variety of techniques for tolerating or reducing memory latency hav...
Trishul M. Chilimbi, Mark D. Hill, James R. Larus
IPPS
2000
IEEE
14 years 4 months ago
The Memory Bandwidth Bottleneck and its Amelioration by a Compiler
As the speed gap between CPU and memory widens, memory hierarchy has become the primary factor limiting program performance. Until now, the principal focus of hardware and softwar...
Chen Ding, Ken Kennedy