Profiling memory accesses - which counters overlap with which?

Hi!

I have some Perl scripts that implement some of the top-level VTune profiling metrics for Sandy Bridge (and Sandy Bridge Xeon).

Notably:

* % of cycles spent on LLC misses: ~53%

* % of cycles spent doing DTLB walks: ~10%

* % of cycles spent accessing data modified by another core: ~3.5%

My question: do the cycles spent doing DTLB walks (DTLB_LOAD_MISSES.WALK_DURATION) and the cycles spent hitting data modified by another core (MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM) overlap in any way with the LLC miss cycles (i.e. memory accesses, MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS)? Do they partially overlap, or not overlap at all?

I'm especially interested in whether the DTLB walks count towards MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS.

Thanks!

-adrian

Hello Adrian,

I'd have to see the full equations for the metrics. Can you post them? Even with the equations it might be hard to tell.

Pat

Hi,

The code is at github.com/erikarn/hwpmc/ . The equations match what's in this PDF:

http://download-software.intel.com/sites/landingpage/legacy/pdfs/Using_Intel_VTune_Amplifier_XE_on_2nd_Gen_Intel_Core_Family.pdf
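
For readers without the PDF at hand, here is a minimal sketch of how the three percentages above are typically derived from raw counter values. It is written in Python rather than the Perl of the actual scripts. The function name `memory_metrics`, the dictionary keys of the result, and the sample counts are hypothetical. The per-event cycle penalties (180, 60 and 7) and the use of DTLB_LOAD_MISSES.STLB_HIT are assumptions recalled from the guide and should be checked against it.

```python
# Sketch only: top-level memory metrics as a percentage of unhalted cycles.
# The penalties below are ASSUMED values, not taken from this thread;
# verify them against the VTune guide before relying on the results.
LLC_MISS_PENALTY = 180   # assumed cycles charged per load missing the LLC
HITM_PENALTY = 60        # assumed cycles charged per XSNP_HITM load
STLB_HIT_PENALTY = 7     # assumed cycles charged per second-level TLB hit


def memory_metrics(counts):
    """Return the three metrics as percentages of CPU_CLK_UNHALTED.THREAD."""
    clk = counts["CPU_CLK_UNHALTED.THREAD"]

    # Event count times an estimated fixed latency.
    llc_cycles = counts["MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS"] * LLC_MISS_PENALTY
    hitm_cycles = counts["MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM"] * HITM_PENALTY

    # WALK_DURATION is already a cycle count, so it is added unscaled.
    dtlb_cycles = (counts["DTLB_LOAD_MISSES.STLB_HIT"] * STLB_HIT_PENALTY
                   + counts["DTLB_LOAD_MISSES.WALK_DURATION"])

    return {
        "llc_miss_pct": 100.0 * llc_cycles / clk,
        "contested_access_pct": 100.0 * hitm_cycles / clk,
        "dtlb_overhead_pct": 100.0 * dtlb_cycles / clk,
    }


if __name__ == "__main__":
    # Hypothetical counter values, for illustration only.
    sample = {
        "CPU_CLK_UNHALTED.THREAD": 1_000_000,
        "MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS": 1_000,
        "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM": 500,
        "DTLB_LOAD_MISSES.STLB_HIT": 1_000,
        "DTLB_LOAD_MISSES.WALK_DURATION": 93_000,
    }
    for name, value in memory_metrics(sample).items():
        print(f"{name}: {value:.1f}%")
```

Written this way, the LLC miss and contested access figures are event counts scaled by an estimated latency, while the DTLB figure is built mostly from a measured cycle count. Nothing in the arithmetic itself guarantees that the three percentages are disjoint, which is why the overlap question matters.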

Thanks!

-adrian

A 42-page document made up of images, and unsearchable. Nice.

The PDF displays for me with a bunch of images, but the formulas are in text, so they're searchable.

bump?
