Monitoring DTLB events?


How much detail can I drill into with the TLB performance counters? In particular, I'd like to see how effective the 2 MB and 1 GB TLB entries are. Are there any counters that let me see TLB hits and misses broken down by page table entry size?




Did you try consulting the E3 or E5 processor datasheets?

We do need to know which chip you are using.

I need to look through the SDM.


It's more of a general question than a chip-specific one, as I'm interested in a variety of platforms: anything Nehalem and later.

But my immediate need is for Sandy Bridge Xeon (E5-2650). I'm also interested in Ivy Bridge and the server version of Ivy Bridge that I hear is coming out soon.

I did read the SDM chapter on performance counting. DTLB_LOAD_MISSES.* is kind of there: it has events for 4K and 2M/4M page walks, but not for 1G page walks. The page-size-specific events also appear only in Table 19-2 (non-architectural performance events, 4th generation Intel Core processors); there is nothing about them in the SB or SB Xeon tables. There is also DTLB_LOAD_MISSES.WALK_DURATION, which counts the cycles spent busy doing a walk; I want to be able to filter _that_ by page size if possible.

In any case, being able to count the 1G page walks would be really, really helpful.
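
Since WALK_DURATION cannot be filtered by page size, one possible workaround is to run the same workload once per page-size configuration and compare the derived walk cost. The sketch below is only an illustration of that arithmetic: `walk_stats` and its parameter names are hypothetical, the readings are made-up placeholders rather than measurements, and it assumes the platform also exposes some count of completed page walks alongside DTLB_LOAD_MISSES.WALK_DURATION and a total cycle count.

```python
def walk_stats(walk_duration_cycles, walks_completed, total_cycles):
    """Derive page-walk cost metrics from raw counter readings.

    All three arguments are plain integers read from whatever tool
    collected the counters; the names are illustrative, not event names.
    """
    if walk_duration_cycles < 0 or walks_completed < 0 or total_cycles <= 0:
        raise ValueError("counter readings must be non-negative, cycles > 0")
    # Average cycles per completed walk (0.0 when no walk completed).
    avg = walk_duration_cycles / walks_completed if walks_completed else 0.0
    # Share of all cycles during which the page walker was busy.
    fraction = walk_duration_cycles / total_cycles
    return {"avg_cycles_per_walk": avg, "walk_cycle_fraction": fraction}


# Placeholder readings for two runs of the same workload (not real data).
runs = {
    "4K pages": walk_stats(3_000_000, 100_000, 50_000_000),
    "1G pages": walk_stats(40_000, 2_000, 45_000_000),
}

for label, stats in runs.items():
    print(f"{label}: {stats['avg_cycles_per_walk']:.1f} cycles/walk, "
          f"{stats['walk_cycle_fraction']:.2%} of cycles in walks")
```

This only gives a per-run aggregate, so it works best when each run maps the working set with a single page size; it does not replace a true per-page-size counter.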



Why do you need the cycle latency of 1 GB page walks? Do you have a large amount of physical RAM, or does your application need enough memory to make use of large pages?
