PEBS Counters and Linux "perf" utility

Sanath Jayasena


Is it possible to correctly access the PEBS counters via the "perf" utility in Linux? For example, suppose that on Nehalem I want to count MEM_LOAD_RETIRED.L1D_HIT events (Code=0xcb, UMask=0x01), and a few other similar events, while running my program "prog". When I run perf as follows, I get some numbers (sometimes "scaled" if I request several events at once):

sudo perf stat -e r01cb ./prog

Are the numbers I get reasonably correct? I ask this because I am not sure I clearly understand the PEBS events.
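
For reference, the raw event string in the command above is the letter "r" followed by the unit mask and then the event code, each as two hex digits. A minimal sketch of that encoding, using the Code and UMask values from the question (the variable names are only illustrative):

```shell
#!/bin/sh
# Build a perf raw event string from an event code and a unit mask.
# Layout used here: "r" + two hex digits of umask + two hex digits of code.
code=0xcb    # event code for MEM_LOAD_RETIRED.L1D_HIT on Nehalem (from the question)
umask=0x01   # unit mask (from the question)

raw_event=$(printf 'r%02x%02x' "$umask" "$code")
echo "$raw_event"    # prints r01cb
```

The resulting string is what gets passed to perf after -e, as in the command shown in the question.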



Hussam Mousa (Intel)

Hello Sanath,

PEBS (Precise Event Based Sampling) is a feature available for a subset of events. It allows the hardware to collect additional information very close to the exact time the counter for the configured event overflowed. This gives the analysis tools substantially more accurate information, since the alternative is to wait for an interrupt to be handled in software before collecting this information, typically hundreds of cycles later. The additional information is stored in a special PEBS buffer and retrieved later by the tool (in this case perf_events).

To use it with perf, append ":pp" to the event encoding, so it would look like this:
sudo perf stat -e r01cb:pp ./prog

The results will typically be more precise only when you are tracing events back to a specific line of code. So perf stat would not make use of this precision, but perf record would.
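
As a rough sketch of the perf record workflow, the commands below sample the precise event and then show where the samples landed. The event r01cb:pp and ./prog come from this thread; the output file name perf.data, the head limits, and the availability check are assumptions added for illustration. Depending on system settings the commands may need sudo, as in the original question, and perf may reject the ":pp" modifier if the event is not PEBS-capable on the CPU in use.

```shell
#!/bin/sh
# Sketch: sample a precise (PEBS) event and attribute the samples to code.
EVENT="r01cb:pp"   # raw event from this thread plus the precise modifier
PROG="./prog"      # the program from the question; substitute your own binary

if command -v perf >/dev/null 2>&1 && [ -x "$PROG" ]; then
    # Collect samples for the event into perf.data (assumed file name).
    perf record -e "$EVENT" -o perf.data "$PROG"
    # Per-symbol summary of where the samples fell.
    perf report -i perf.data --stdio | head -n 40
    # Per-instruction view; source lines appear if the binary has debug info.
    perf annotate -i perf.data --stdio | head -n 60
else
    echo "perf or $PROG not available; skipping" >&2
fi
```

Building the program with debug information (for example, -g with gcc) makes the annotated view easier to map back to source lines.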

Please let me know if you need additional details. Be sure to include the kernel version and the specific CPU model number you are using (cat /proc/cpuinfo).

Sanath Jayasena

Thanks, Hussam, for the clarification. Here is one question I have now.

[System: Intel Core i7, Sandy Bridge, cpu family 6, model 42; Linux kernel 3.0.0-24-generic-pae / Ubuntu 11.10; perf version 3.0.38]

For PEBS events, perf stat gives some numbers even without the ":pp" or ":p" suffix. In the cases I tried, I did not see much difference between using ":pp" and not using it. Do you still recommend always using ":pp" to be safe?

Here is an example:

%> sudo perf stat -e r00c0,r01c0,r01c0:p,r01c0:pp ./prog 2
N= 100000000 : NumThreads= 2 : Time= 439.044 msec

Performance counter stats for './prog 2':

    10,707,296,495 r00c0       [49.96%]
    10,671,452,933 r01c0       [50.13%]
    10,676,277,604 r01c0:p     [25.11%]
    10,704,588,878 r01c0:pp    [25.01%]

2.192624454 seconds time elapsed
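
The bracketed percentages in this output indicate that the four events were time-multiplexed onto the hardware counters: each event was counted for only that fraction of the run, and perf stat prints an estimate scaled up to the full run. As a minimal sketch of that arithmetic, assuming the usual scaling of raw count times enabled time over running time, the snippet below works backward from the r01c0:p line to the approximate number of events actually counted. The variable names are illustrative.

```shell
#!/bin/sh
# Work backward from a scaled perf stat count to the approximate raw count.
# Assumption: printed count = raw count * (time enabled / time running),
# and the bracketed percentage is (time running / time enabled) * 100.
scaled=10676277604   # value printed for r01c0:p
pct=25.11            # bracketed percentage on the same line

observed=$(awk -v s="$scaled" -v p="$pct" 'BEGIN { printf "%.0f", s * p / 100 }')
echo "approximately $observed events were actually counted"   # roughly 2.68 billion
```

Requesting fewer events per run avoids multiplexing, so the printed counts need no scaling.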

Hussam Mousa (Intel)

Hi Sanath,

The PEBS and non-PEBS versions of an event will both produce very similar counts, whether the event is cycles or any other event.
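
To put a number on "very similar", the sketch below compares the r01c0 and r01c0:pp counts from the perf stat run posted earlier in this thread. Both values are scaled estimates from a multiplexed run, so the result is only indicative; the variable names are illustrative.

```shell
#!/bin/sh
# Relative difference between the plain and the ":pp" count of the same event.
plain=10671452933     # r01c0 from the posted perf stat output
precise=10704588878   # r01c0:pp from the same run

rel=$(awk -v a="$plain" -v b="$precise" 'BEGIN { printf "%.2f", (b - a) / a * 100 }')
echo "relative difference: ${rel}%"    # prints relative difference: 0.31%
```

A difference well under one percent is in line with the point that the precise modifier does not change what perf stat counts.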

The added accuracy from PEBS applies to identifying the active "Instruction Pointer", which can be traced to a specific line of code. This is typically collected with 'perf record'.
