Sciweavers

134 search results - page 2 / 27
» Locating cache performance bottlenecks using data profiling
MICRO
1997
IEEE
90 views, Hardware
13 years 11 months ago
ProfileMe: Hardware Support for Instruction-Level Profiling on Out-of-Order Processors
Profile data is valuable for identifying performance bottlenecks and guiding optimizations. Periodic sampling of a processor's performance monitoring hardware is an effective...
Jeffrey Dean, James E. Hicks, Carl A. Waldspurger,...
VALUETOOLS
2006
ACM
167 views, Hardware
14 years 1 months ago
Detailed cache simulation for detecting bottleneck, miss reason and optimization potentialities
Cache locality optimization is an efficient way for reducing the idle time of modern processors in waiting for needed data. This kind of optimization can be achieved either on the...
Jie Tao, Wolfgang Karl
LCTRTS
2007
Springer
14 years 1 months ago
Addressing instruction fetch bottlenecks by using an instruction register file
The Instruction Register File (IRF) is an architectural extension for providing improved access to frequently occurring instructions. An optimizing compiler can exploit an IRF by ...
Stephen Roderick Hines, Gary S. Tyson, David B. Wh...
ISLPED
2004
ACM
137 views, Hardware
14 years 1 months ago
Location cache: a low-power L2 cache system
While set-associative caches incur fewer misses than direct-mapped caches, they typically have slower hit times and higher power consumption, when multiple tag and data banks are p...
Rui Min, Wen-Ben Jone, Yiming Hu
CF
2006
ACM
13 years 9 months ago
Intermediately executed code is the key to find refactorings that improve temporal data locality
The growing speed gap between memory and processor makes an efficient use of the cache ever more important to reach high performance. One of the most important ways to improve cac...
Kristof Beyls, Erik H. D'Hollander