cache misses | Sciweavers

EUROSYS
2010
ACM

139 views | Software Engineering

Locating cache performance bottlenecks using data profiling

14 years 11 months ago

Effective use of CPU data caches is critical to good performance, but poor cache use patterns are often hard to spot using existing execution profiling tools. Typical profilers at...

Aleksey Pesterev, Nickolai Zeldovich, Robert T. Mo...

ISSS
1996
IEEE

123 views | Hardware

Memory Organization for Improved Data Cache Performance in Embedded Processors

14 years 11 months ago

Download www.cs.york.ac.uk

Code generation for embedded processors creates opportunities for several performance optimizations not applicable for traditional compilers. We present techniques for improving d...

Preeti Ranjan Panda, Nikil D. Dutt, Alexandru Nico...

SC
2000
ACM

83 views | Applied Computing

Using Hardware Performance Monitors to Isolate Memory Bottlenecks

14 years 12 months ago

Download www.cs.umd.edu

In this paper, we present and evaluate two techniques that use different styles of hardware support to provide data structure specific processor cache information. In one approach...

Bryan R. Buck, Jeffrey K. Hollingsworth

ISCA
2000
IEEE

111 views | Hardware

Understanding the backward slices of performance degrading instructions

14 years 12 months ago

Download www.ece.lsu.edu

For many applications, branch mispredictions and cache misses limit a processor’s performance to a level well below its peak instruction throughput. A small fraction of static i...

Craig B. Zilles, Gurindar S. Sohi

HPCA
2000
IEEE

98 views | Distributed And Parallel Com...

Software-Controlled Multithreading Using Informing Memory Operations

14 years 12 months ago

Download reports-archive.adm.cs.cmu.edu

Memory latency is becoming an increasingly important performance bottleneck, especially in multiprocessors. One technique for tolerating memory latency is multithreading, whereby we sw...

Todd C. Mowry, Sherwyn R. Ramkissoon

MICRO
2002
IEEE

164 views | Hardware

A quantitative framework for automated pre-execution thread selection

15 years 13 days ago

Download www.cis.upenn.edu

Pre-execution attacks cache misses for which conventional address-prediction driven prefetching is ineffective. In pre-execution, copies of cache miss computations are isolated fr...

Amir Roth, Gurindar S. Sohi

ISCA
2010
IEEE

232 views | Hardware

Data marshaling for multi-core architectures

15 years 19 days ago

Download www.ece.cmu.edu

Previous research has shown that Staged Execution (SE), i.e., dividing a program into segments and executing each segment at the core that has the data and/or functionality to bes...

M. Aater Suleman, Onur Mutlu, José A. Joao,...

ICS
2003
Tsinghua U.

111 views | Distributed And Parallel Com...

Enhancing memory level parallelism via recovery-free value prediction

15 years 22 days ago

Download www.cs.ucf.edu

The ever-increasing computational power of contemporary microprocessors reduces the execution time spent on arithmetic computations (i.e., the computations not involving slow me...

Huiyang Zhou, Thomas M. Conte

SIGMETRICS
2003
ACM

147 views | Hardware

Effect of node size on the performance of cache-conscious B+-trees

15 years 23 days ago

Download www.eecs.umich.edu

In main-memory databases, the number of processor cache misses has a critical impact on the performance of the system. Cache-conscious indices are designed to improve performance b...

Richard A. Hankins, Jignesh M. Patel

IPPS
2003
IEEE

105 views | Distributed And Parallel Com...

Miss Penalty Reduction Using Bundled Capacity Prefetching in Multiprocessors

15 years 24 days ago

Download www.it.uu.se

While prefetch has proven itself useful for reducing cache misses in multiprocessors, traffic is often increased due to extra unused prefetch data. Prefetching in multiprocessors...

Dan Wallin, Erik Hagersten
