PMU Events for Ivy Bridge

Hello guys!

I'm using OProfile and perf to profile some benchmarks, looking specifically for caching issues. I'm using the Intel SDM Volume 3 (March 2013) as my guide for choosing which events to monitor, but it has been a pain.

The machine I'm running the experiments on has an i7-3630QM (that is, Ivy Bridge), so in the manual I'm looking at Tables 19-1 and 19-5. The problem is: which events should I use to measure L1 data and instruction cache (L1D, L1I) events? What about L3 (LLC)? Honestly, the event descriptions in Table 19-5 are vaguer than usual.

Can anyone help with this?
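As a starting point on the perf side, here is a minimal Python sketch that builds a `perf stat` command line from perf's generic cache event names rather than the raw SDM events. It assumes these generic events are exposed on the machine (`perf list` shows what the kernel actually supports); the helper names and the `./my_benchmark` command are hypothetical.

```python
import shlex

# Generic perf cache events (assumption: all of them are listed by `perf list`
# on this kernel/CPU; drop any that are reported as not supported).
CACHE_EVENTS = [
    "L1-dcache-loads",
    "L1-dcache-load-misses",
    "L1-icache-load-misses",
    "LLC-loads",
    "LLC-load-misses",
]


def build_perf_stat_command(benchmark_cmd, events=CACHE_EVENTS):
    """Return an argv list that runs `benchmark_cmd` under `perf stat`."""
    return ["perf", "stat", "-e", ",".join(events), "--"] + shlex.split(benchmark_cmd)


def miss_ratio(misses, accesses):
    """Fraction of accesses that missed; 0.0 when nothing was counted."""
    return misses / accesses if accesses else 0.0


if __name__ == "__main__":
    # Hypothetical benchmark binary; print the command instead of running it.
    print(" ".join(build_perf_stat_command("./my_benchmark --size 1024")))
```

The printed command can be pasted into a shell; `miss_ratio` can then be applied to the counts that perf reports, for example L1-dcache-load-misses over L1-dcache-loads.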
César.


Usually to answer this question, I download VTune and install it and see what metrics and events VTune uses.

In fact, that is what I would have to do to answer the question. Keeping track of events from chip to chip is one of the 'pains in the rear' that the VTune folks have to deal with. VTune may or may not have the metric you want, but that is the first place to check.

Pat

Hi Patrick, thanks for your answer.

I did what you suggested. Although some of VTune's analyses don't work on my machine (e.g. Sandy/Ivy Bridge -> Memory Access ---> "Error: ... not aplicable to current machine microarchitecture"), I collected some event names, and I guess one of them (or a combination) achieves what I want, which is to measure L1, L2 and L3 cache hits/misses on Ivy Bridge. However, I have some questions:

1) Can I use *only* the two events below to account for all stalls caused by L1D / L2?

CYCLE_ACTIVITY.STALLS_L1D_PENDING  
CYCLE_ACTIVITY.STALLS_L2_PENDING 

2) I could not understand the description of the following event. What is an unknown data source?

"MEM_LOAD_UOPS_RETIRED.LLC_MISS_PS  --> Miss in last-level (L3) cache. Excludes Unknown data-source."

Thank you again for your help.

I found a pretty good explanation of how (and why) to measure L1, L2 and L3 "misses" this way on Ivy Bridge. The text is subsection B.3.2.3, Memory Bound Characterization, of the Optimization Reference Manual (July 2013 version).

However, I have some questions about the equations shown in this subsection. They compute the percentage of *CYCLES* lost to "misses" at the various levels of the cache hierarchy, right? Shouldn't these equations use CYCLE_ACTIVITY.CYCLES_LDM_PENDING instead of CYCLE_ACTIVITY.STALLS_LDM_PENDING?
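To make the question concrete, here is a minimal Python sketch of how I read the L1 and L2 equations. It assumes they have the form (STALLS_LDM_PENDING - STALLS_L1D_PENDING) / CLOCKS and (STALLS_L1D_PENDING - STALLS_L2_PENDING) / CLOCKS, which should be checked against the manual. The function name, the dictionary keys and the sample counts are hypothetical.

```python
def memory_bound_breakdown(clocks, stalls_ldm, stalls_l1d, stalls_l2):
    """Split load-related stall cycles by cache level, as fractions of clocks.

    Assumption (verify against subsection B.3.2.3): each STALLS_* count
    includes the stalls of the levels below it, so a level's own share is
    the difference between two adjacent counters.
    """
    if clocks <= 0:
        raise ValueError("clocks must be positive")
    return {
        # Stalled with a load pending, but no L1D miss outstanding.
        "l1_bound": (stalls_ldm - stalls_l1d) / clocks,
        # Stalled with an L1D miss pending that is served by L2.
        "l2_bound": (stalls_l1d - stalls_l2) / clocks,
        # Stalled with an L2 miss pending (L3 and memory together).
        "beyond_l2": stalls_l2 / clocks,
    }


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    result = memory_bound_breakdown(
        clocks=1000, stalls_ldm=400, stalls_l1d=300, stalls_l2=100
    )
    for level, fraction in result.items():
        print(f"{level}: {fraction:.1%}")
```

Replacing `stalls_ldm` with a CYCLES_LDM_PENDING count would also include cycles in which a load is pending but execution is not stalled, which is the difference I am asking about.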

I'm looking forward to your comments.
Thanks, 
