Classifying D-TLB misses using Data access profiling

Hello, I need to estimate the number of DTLB misses caused by accesses to the heap region versus those caused by accesses to the stack region. In the PTU user guide, the Data Access Profiling section (2.5) mentions that PTU can figure out the linear address of a memory operand for an event. If the linear address of an event can be found, then it seems possible to figure out whether the access goes to a dynamically allocated region or not. However, I am not really clear on how I can use this facility of PTU to classify the DTLB misses as mentioned above. Can anybody provide me some leads on this? Thanks, Arka

Hi Arka,

Although Data Access profiling in PTU can reconstruct linear addresses for the
samples of precise events, it is not able to distinguish heap accesses from
stack and static ones.

PTU may try to match addresses to static data (this can be seen in the data
view by changing the granularity to Data Objects), but its sampling collection doesn't track dynamic allocations, so it can't distinguish heap accesses.

However, you can try to distinguish heap addresses from other addresses yourself by looking at which virtual address region the system uses for the heap.
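A minimal sketch of that idea on Linux, assuming you have already exported the sampled linear addresses from PTU as integers and saved a copy of /proc/<pid>/maps while the process was running. The function names and the sample map below are hypothetical, made up for illustration only. Note the limits: only the brk-based heap is labelled [heap] and only the main thread's stack is labelled [stack]; large malloc requests served by mmap and the stacks of other threads show up as unnamed anonymous mappings, so they are reported separately here.

```python
from collections import Counter


def parse_maps(text):
    """Parse the text of /proc/<pid>/maps into (start, end, name) tuples."""
    regions = []
    for line in text.splitlines():
        # Fields: address-range perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 5:
            continue  # skip blank or malformed lines
        start_s, end_s = fields[0].split("-")
        name = fields[5].strip() if len(fields) > 5 else ""
        regions.append((int(start_s, 16), int(end_s, 16), name))
    return regions


def classify(addr, regions):
    """Return a coarse label for the mapping that contains addr."""
    for start, end, name in regions:
        if start <= addr < end:
            if name == "[heap]":
                return "heap"
            if name == "[stack]":
                return "stack"
            if name == "":
                return "anonymous"  # mmap-backed malloc, thread stacks, ...
            return "file-or-other"
    return "unmapped"


def tally(addresses, regions):
    """Count how many sampled addresses fall into each category."""
    return Counter(classify(a, regions) for a in addresses)


# Made-up example map, in the format the kernel prints.
SAMPLE_MAPS = """\
00400000-0040c000 r-xp 00000000 08:01 1234 /bin/example
0060c000-0060d000 rw-p 0000c000 08:01 1234 /bin/example
01a3e000-01a5f000 rw-p 00000000 00:00 0 [heap]
7f3c5a000000-7f3c5a021000 rw-p 00000000 00:00 0
7ffd1c2e4000-7ffd1c305000 rw-p 00000000 00:00 0 [stack]
"""

if __name__ == "__main__":
    regions = parse_maps(SAMPLE_MAPS)
    samples = [0x01A3E010, 0x7FFD1C300000, 0x7F3C5A000100]
    for label, count in tally(samples, regions).items():
        print(label, count)
```

Because the address space layout can change during a run (the heap grows, mappings come and go), a map captured at one moment is only an approximation for samples taken at other times.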

Also, you have
not said which CPU you are doing the measurements on. Make sure that the DTLB miss
events you use are precise ones.

Hope it helps.


Thanks a lot, Julia. I am using the Intel Sandy Bridge architecture. Sorry I missed that information in my previous e-mail. If I am not mistaken, the DTLB miss event is a precise one on Sandy Bridge. And I do agree with you that, by looking at the linear address of the access, it can be classified as an access to the stack or a non-stack region fairly accurately, given that the stack is contained within a specific part of the linear address space of a process (at least in Linux x86-64). However, I am a bit confused about where exactly I should look for the address that caused the event, and where I should put my instrumentation/code to classify that event as stack or non-stack. I would really appreciate a bit of hand-holding in this regard. FYI: I am using Linux x86-64 and I prefer to work through the command line interface rather than the GUI. Thanks again, Arka

Arka, you are right, there are precise events
starting with MEM_UOPS_RETIRED.STLB_MISS.. (or the like; it has been a long time since I worked with them). As soon
as you choose an event to collect, PTU shows whether it is precise or not.

Don't forget to set "Enable data profiling" when you start the collection from the GUI.

Linear data addresses are reported at cache-line
granularity, together with the offset within the cache line.
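As a rough illustration of how a cache-line address and an offset combine into a full linear address, here is a small sketch. It assumes 64-byte cache lines (as on Sandy Bridge) and 4 KiB pages; the function names are hypothetical and the exact format in which PTU prints these values is not assumed here. The page number is included because DTLB misses are a per-page effect, so many cache lines share one translation.

```python
CACHE_LINE_SIZE = 64   # bytes; assumption, matches Sandy Bridge
PAGE_SHIFT = 12        # assumption: 4 KiB pages (no huge pages)


def split_address(addr, line_size=CACHE_LINE_SIZE):
    """Split a linear address into (cache-line base, offset in line)."""
    offset = addr & (line_size - 1)
    return addr - offset, offset


def join_address(line_base, offset, line_size=CACHE_LINE_SIZE):
    """Rebuild the full linear address from a line base and an offset."""
    if line_base % line_size != 0:
        raise ValueError("line base is not aligned to the cache-line size")
    if not 0 <= offset < line_size:
        raise ValueError("offset does not fit inside one cache line")
    return line_base + offset


def page_number(addr, page_shift=PAGE_SHIFT):
    """Virtual page number that a translation for addr would cover."""
    return addr >> page_shift


if __name__ == "__main__":
    base, off = split_address(0x7FFD1C2E4A58)
    print(hex(base), hex(off), hex(page_number(join_address(base, off))))
```

Once the full address is rebuilt, it can be checked against whatever address ranges you treat as stack or heap.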

They can be seen in the GUI in the so-called Memory Hotspot
pane (available in the Data Access View; check the user manual) or on the command line with vtdpview.exe (simply run
vtdpview for help and consult the user manual).

Overall, I recommend reading the user manual. There are chapters on the command line and GUI for data profiling, as well as generic ones about PTU organization that help you understand the logic behind collection and viewing.

Initially it might be easier to do things through the GUI; as soon as you get some experience, you can switch to the command line and do your own analysis based on its output.

