Sciweavers

1001 search results - page 72 / 201
» Improving memory hierarchy performance for irregular applica...
ISCA
2012
IEEE
13 years 8 months ago
Boosting mobile GPU performance with a decoupled access/execute fragment processor
Smartphones represent one of the fastest growing markets, providing significant hardware/software improvements every few months. However, supporting these capabilities reduces the...
Jose-Maria Arnau, Joan-Manuel Parcerisa, Polychron...
SBACPAD
2006
IEEE
16 years 8 days ago
Ultra-Fast CPU Performance Prediction: Extending the Monte Carlo Approach
Performance evaluation of contemporary processors is becoming increasingly difficult due to the lack of proper frameworks. Traditionally, cycle-accurate simulators have been exte...
Ram Srinivasan, Jeanine Cook, Olaf M. Lubeck
DAC
2005
ACM
16 years 7 months ago
Improving Java virtual machine reliability for memory-constrained embedded systems
Dual-execution/checkpointing based transient error tolerance techniques have been widely used in the high-end mission critical systems. These techniques, however, are not very att...
Guangyu Chen, Mahmut T. Kandemir
CF
2007
ACM
15 years 10 months ago
An analysis of the effects of miss clustering on the cost of a cache miss
In this paper we describe a new technique, called pipeline spectroscopy, and use it to measure the cost of each cache miss. The cost of a miss is displayed (graphed) as a histogra...
Thomas R. Puzak, Allan Hartstein, Philip G. Emma, ...
MICRO
2010
IEEE
15 years 4 months ago
Many-Thread Aware Prefetching Mechanisms for GPGPU Applications
We consider the problem of how to improve memory latency tolerance in massively multithreaded GPGPUs when the thread-level parallelism of an application is not sufficien...
Jaekyu Lee, Nagesh B. Lakshminarayana, Hyesoon Kim...