ProfileMe: Hardware Support for Instruction-Level Profiling on Out-of-Order Processors



Profile data is valuable for identifying performance bottlenecks and guiding optimizations. Periodic sampling of a processor's performance monitoring hardware is an effective, unobtrusive way to obtain detailed profiles. Unfortunately, existing hardware simply counts events, such as cache misses and branch mispredictions, and cannot accurately attribute these events to instructions, especially on out-of-order machines. We propose an alternative approach, called ProfileMe, that samples instructions. As a sampled instruction moves through the processor pipeline, a detailed record of all interesting events and pipeline stage latencies is collected. ProfileMe also supports paired sampling, which captures information about the interactions between concurrent instructions, revealing information about useful concurrency and the utilization of various pipeline stages while an instruction is in flight. We describe an inexpensive hardware implementation of ProfileMe, outline a variety of sof...
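The instruction-sampling idea in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the hardware design from the paper: the trace format, the event name `dcache_miss`, the stage name `issue`, and the functions `sample_instructions` and `aggregate` are all hypothetical, and paired sampling is not modeled. A randomized countdown selects which instruction to profile; the selected instruction carries a record of its events and stage latencies, and records are then summed per program counter so that events are attributed to specific instructions.

```python
import random
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ProfileRecord:
    """What a sampled instruction reports (hypothetical fields)."""
    pc: int
    events: set = field(default_factory=set)
    stage_latency: dict = field(default_factory=dict)
    retired: bool = False


def sample_instructions(trace, mean_interval, rng):
    """Yield a ProfileRecord for each instruction chosen by a random countdown.

    The countdown is drawn uniformly from [1, 2*mean_interval - 1], so on
    average one instruction in `mean_interval` is sampled.
    """
    countdown = rng.randint(1, 2 * mean_interval - 1)
    for inst in trace:
        countdown -= 1
        if countdown == 0:
            yield ProfileRecord(
                pc=inst["pc"],
                events=set(inst["events"]),
                stage_latency=dict(inst["latency"]),
                retired=inst["retired"],
            )
            countdown = rng.randint(1, 2 * mean_interval - 1)


def aggregate(records):
    """Sum sample counts, event counts, and stage latencies per PC."""
    summary = defaultdict(lambda: {
        "samples": 0,
        "retired": 0,
        "events": defaultdict(int),
        "latency": defaultdict(int),
    })
    for rec in records:
        entry = summary[rec.pc]
        entry["samples"] += 1
        entry["retired"] += int(rec.retired)
        for event in rec.events:
            entry["events"][event] += 1
        for stage, cycles in rec.stage_latency.items():
            entry["latency"][stage] += cycles
    return summary


if __name__ == "__main__":
    # Synthetic trace; real data would come from the profiling hardware.
    trace = [
        {"pc": 0x100, "events": ["dcache_miss"], "latency": {"issue": 3}, "retired": True},
        {"pc": 0x104, "events": [], "latency": {"issue": 1}, "retired": True},
    ] * 1000
    summary = aggregate(sample_instructions(trace, 50, random.Random(0)))
    for pc, entry in sorted(summary.items()):
        print(hex(pc), entry["samples"], dict(entry["events"]), dict(entry["latency"]))
```

Because each record belongs to exactly one instruction, the per-PC totals attribute events directly to that instruction, in contrast to an event counter that only reports how many events occurred.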

Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, George Z. Chrysos


Hardware | MICRO 1997 | Performance Monitoring Hardware | Pipeline Stage Latencies | Pipeline Stages |


Post Info

Added	26 Aug 2010
Updated	26 Aug 2010
Type	Conference
Year	1997
Where	MICRO
Authors	Jeffrey Dean, James E. Hicks, Carl A. Waldspurger, William E. Weihl, George Z. Chrysos
