Utilizing load latency event in performance monitoring to get line fill buffer breakdown
By Rajshree Chabukswar (Intel) on November 11, 2010 at 1:27 pm
Mike Chynoweth described how to use performance monitoring events to identify the source of a load in the memory hierarchy in his blog post:
http://software.intel.com/en-us/blogs/2010/09/30/utilizing-performance-monitoring-events-to-find-problematic-loads-due-to-latency-in-the-memory-hierarchy/
In this post, we look at how the load latency capability offered by performance monitoring can be used to identify the latency observed at each data source. We experimented with this capability to estimate how load sources can be broken down further when a data request is satisfied from the Line Fill Buffer (LFB).
Load latency samples only a small fraction of the total loads. The loads to be sampled are selected by a complex internal mechanism.
The information that load latency offers includes the data source of each sampled load and the latency value observed at that data source. Using this information, we can estimate the equivalent data source (based on latency values) when a significant share of samples comes from line fill buffers. A load that hits in the LFB means that a previous hardware prefetch, load, or store has already missed the L1D on an address in the same cache line and has allocated a fill buffer for that cache line. The latency of our immediate demand load is variable because it hits in an existing line fill buffer, so it depends on how far the earlier request has already progressed. When we see significant samples coming from the LFB, the technique below helps identify the potential data sources using the actual latency values observed on the LFB samples.
As shown in the example below, ~35% of the total samples come from fill buffers.
Using the latency values on the fill buffer data source, we can estimate, based on latency, what the approximate data source would be, as shown below. (Note that this is only an estimate based on the actual latency values observed.) As seen in the picture below, 13% of the samples from the LFB had latency equivalent to the mid-level cache, and 77% had latency equivalent to the last-level cache.
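The estimate described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample format (a list of `(data_source, latency_cycles)` pairs), the function names, and especially the cycle thresholds in `ASSUMED_BUCKETS` are hypothetical and do not come from this post or from any tool. Real thresholds depend on the processor and should be taken from the latencies observed on the non-LFB data sources of the same run.

```python
from collections import Counter

# Hypothetical latency ceilings in core cycles (assumption, not from the post).
# Each LFB sample is assigned to the first bucket whose ceiling it does not exceed.
ASSUMED_BUCKETS = [
    (10, "L1D equivalent"),
    (20, "mid-level cache equivalent"),
    (80, "last-level cache equivalent"),
]
BEYOND_LAST_BUCKET = "memory equivalent"


def lfb_share(samples):
    """Return the fraction of all samples whose data source is the LFB."""
    if not samples:
        return 0.0
    lfb_count = sum(1 for source, _ in samples if source == "LFB")
    return lfb_count / len(samples)


def classify_latency(latency_cycles):
    """Map one observed latency to the assumed equivalent data source."""
    for ceiling, label in ASSUMED_BUCKETS:
        if latency_cycles <= ceiling:
            return label
    return BEYOND_LAST_BUCKET


def lfb_breakdown(samples):
    """Return {equivalent source: fraction of LFB samples} for LFB samples only."""
    lfb_latencies = [latency for source, latency in samples if source == "LFB"]
    if not lfb_latencies:
        return {}
    counts = Counter(classify_latency(latency) for latency in lfb_latencies)
    return {label: count / len(lfb_latencies) for label, count in counts.items()}


if __name__ == "__main__":
    # Made-up samples purely to exercise the functions.
    demo = [("L1D", 4), ("LFB", 15), ("LFB", 40), ("LLC", 38), ("LFB", 300)]
    print(f"LFB share of all samples: {lfb_share(demo):.0%}")
    for label, fraction in lfb_breakdown(demo).items():
        print(f"  {label}: {fraction:.0%} of LFB samples")
```

The bucket boundaries are the only judgment call here. Keeping them in one table makes it easy to replace the assumed values with the latencies actually measured for each cache level on the system being profiled.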




Michael Chynoweth (Intel)
http://software.intel.com/en-us/articles/intel-performance-bottleneck-analyzer/
Please download and tell us what you think.
Thanks,
Mike