Event differences

What are the differences between the UNC_L3_MISS.ANY (09_03H), LLC_MISSES (2E_41H), and MEM_LOAD_RETIRED.L3_MISS (CB_10H) events in how they determine what an L3 cache miss is on the quad-core i7 processor (Family_Model 06_1EH)?

Hello heinrej,
UNC_L3_MISS.ANY (09_03h) counts all L3 misses seen in the uncore.
MEM_LOAD_RETIRED.LLC_MISS (cb_10h) counts 'Retired loads that miss the LLC cache'.
If the prefetchers bring the data in from memory early enough that the load itself no longer misses the LLC, MEM_LOAD_RETIRED.LLC_MISS does not increment.
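
The codes in parentheses are event-select/unit-mask pairs (for example, event CBh with umask 10h). As a minimal sketch of how such a pair maps onto a core counter's IA32_PERFEVTSELx value, assuming the architectural layout (event in bits 7:0, umask in bits 15:8, USR bit 16, OS bit 17, EN bit 22): the helper name `perfevtsel` is hypothetical, and the sketch covers only the two core events. UNC_L3_MISS.ANY is an uncore event and is programmed through separate uncore registers that are not shown here.

```c
#include <stdint.h>

/* Flag bits in IA32_PERFEVTSELx (architectural performance monitoring). */
#define PERFEVTSEL_USR (1ULL << 16)  /* count while in user mode (ring > 0) */
#define PERFEVTSEL_OS  (1ULL << 17)  /* count while in kernel mode (ring 0) */
#define PERFEVTSEL_EN  (1ULL << 22)  /* enable the counter */

/* Hypothetical helper: pack an event code and unit mask into an
 * event-select value. Event goes in bits 7:0, umask in bits 15:8. */
static uint64_t perfevtsel(uint8_t event, uint8_t umask, uint64_t flags)
{
    return (uint64_t)event | ((uint64_t)umask << 8) | flags;
}
```

With all three flags set, `perfevtsel(0x2E, 0x41, ...)` would select LLC_MISSES and `perfevtsel(0xCB, 0x10, ...)` would select MEM_LOAD_RETIRED.LLC_MISS. Writing the value to the MSR requires ring 0 access and is left out.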

You can test this by disabling the prefetchers in your BIOS (if the BIOS supports disabling the prefetchers).

With the prefetchers disabled, the count of MEM_LOAD_RETIRED.LLC_MISS should come out very close to UNC_L3_MISS.ANY.
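
One way to put a number on "very close" is to compute the share of uncore misses that has no matching retired-load miss. This is only a sketch: the function name is hypothetical, and it assumes both counts cover the same time interval and that the per-core MEM_LOAD_RETIRED.LLC_MISS counts have already been summed, since the uncore counter is shared by all cores on the socket.

```c
#include <stdint.h>

/* Hypothetical helper: fraction of uncore L3 misses that is not
 * matched by a retired-load LLC miss. Returns 0.0 when there are no
 * uncore misses or when the retired-load count is not smaller. */
static double unmatched_miss_fraction(uint64_t unc_l3_miss_any,
                                      uint64_t mem_load_retired_llc_miss)
{
    if (unc_l3_miss_any == 0 || mem_load_retired_llc_miss >= unc_l3_miss_any)
        return 0.0;
    return (double)(unc_l3_miss_any - mem_load_retired_llc_miss)
           / (double)unc_l3_miss_any;
}
```

A value near 0 with the prefetchers disabled, and a clearly larger value with them enabled, would be consistent with the explanation above.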

I can't test LLC_MISSES (2e_41h) on my system, but you should easily be able to see whether the prefetchers affect event 2e_41h.
Hope this helps,

If my BIOS does not support disabling the prefetchers, is there another way that I can disable them?

Hello heinerj,
There is no publicly disclosed method of disabling the prefetchers on Nehalem, Sandy Bridge, and similar chips.

Maybe you can explain to me why every instruction that references a memory location, even in deep loops over an array of consecutive memory locations, causes a retired load miss in the L3 cache. Sorry about labelling the event wrong; it should read MEM_LOAD_RETIRED.L3_MISS.

To my knowledge, when a miss occurs in the L3 cache, it brings in a page from physical memory and loads it into the L3 cache (cache line fill), as well as the L2 or L1 cache. From that point on, subsequent memory reads should hit in the L3 cache and not miss (locality).

However, I have a for loop that reads 2MB of consecutive memory, and each pass of the loop causes 6 L3 cache misses. Three of these misses come from loading the index value from memory, reading the end condition from memory, and storing the incremented index value in memory. One comes from reading the destination address, and two come from reading a storage variable and writing back its incremented value. So each memory read/write causes an L3 miss.
The for loop is shown below:
pt = 0x2300000;
for (l = 0; l < read_2MB; l++)  // this causes 3 misses
    temp += pt[l];              // this causes 3 misses
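
For comparison with the six misses per pass reported above, here is a rough sketch of what spatial locality alone would predict for the array reads. It assumes 64-byte cache lines and ordinary cacheable (write-back) memory, and the helper name is hypothetical. Under those assumptions a single sequential pass touches each line once, so the array itself should account for at most one miss per 64 bytes read, not one per access.

```c
#include <stddef.h>

#define CACHE_LINE_BYTES 64u  /* assumed line size (Nehalem-class cores) */

/* Hypothetical helper: number of distinct cache lines touched by a
 * sequential read of `bytes` bytes starting at address `start`. */
static size_t lines_touched(size_t start, size_t bytes)
{
    if (bytes == 0)
        return 0;
    size_t first = start / CACHE_LINE_BYTES;
    size_t last  = (start + bytes - 1) / CACHE_LINE_BYTES;
    return last - first + 1;
}
```

For the 2MB region starting at 0x2300000, `lines_touched(0x2300000, 2 * 1024 * 1024)` gives 32768 lines. A measured count far above that suggests that one of the assumptions does not hold for the memory being accessed.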
