How to use Perf and import its result into VTune(TM) Amplifier XE?

Perf is a native performance tool of the Linux* operating system. Its usage is very similar to OProfile and GProf. It uses the Performance Monitoring Unit (PMU) to set performance counters before profiling the target application, and after profiling it reports elapsed CPU cycles, retired instructions, cache misses, branch mispredictions, and so on.

For customers who need to use Perf within VTune(TM) Amplifier XE to collect an application's performance data, VTune Amplifier XE 2013 Update 17 integrates Perf into the product. The command is "amplxe--perf". The original VTune Amplifier command, amplxe-cl, can then be used to import the trace file into a VTune Amplifier result. Here is an example:
1.    amplxe--perf record -o peter.perf -T --force-per-cpu -e cpu-cycles,instructions -- ./primes.icc
Determining primes from 1 - 100000 
Found 9592 primes
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.924 MB peter.perf (~40350 samples) ]

2.    amplxe-cl -import peter.perf -r r0001
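
The two steps above can be sketched as one small shell script. This is only an illustration, not part of the product: the names PERF_CMD, TRACE, RESULT_DIR, DRY_RUN, run and collect_and_import are hypothetical, and the script defaults to a dry run that only prints the commands, since actually running them requires VTune Amplifier XE to be installed.

```shell
#!/bin/sh
# Sketch: collect with the bundled Perf, then import into a new VTune result.
# All variable and function names here are hypothetical, for illustration only.
PERF_CMD="${PERF_CMD:-amplxe--perf}"   # command name used by 2013 Update 17
TRACE="${TRACE:-peter.perf}"           # Perf trace file to write
RESULT_DIR="${RESULT_DIR:-r0001}"      # VTune result directory to create
DRY_RUN="${DRY_RUN:-1}"                # 1 = only print commands, 0 = run them

run() {
    # Print the command; execute it only when DRY_RUN=0.
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

collect_and_import() {
    # "$@" is the target application and its arguments.
    # The import needs a new result directory, so refuse to reuse one.
    if [ -e "$RESULT_DIR" ]; then
        echo "error: $RESULT_DIR already exists" >&2
        return 1
    fi
    # Step 1: record cpu-cycles and instructions for the target.
    run "$PERF_CMD" record -o "$TRACE" -T --force-per-cpu \
        -e cpu-cycles,instructions -- "$@" || return 1
    # Step 2: import the trace into a new VTune result.
    run amplxe-cl -import "$TRACE" -r "$RESULT_DIR"
}

collect_and_import ./primes.icc
```

Setting DRY_RUN=0 in the environment would execute the commands instead of printing them.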

Notes:
1.    Perf is integrated in VTune Amplifier XE 2013 Update 17 and supports launch mode as well as attach mode. For example: "amplxe--perf record -o peter1.perf -T --force-per-cpu -e cpu-cycles,instructions -p <PID> sleep 10"
2.    Perf uses PMU event-based sampling, so it cannot work together with VTune's EBS collector in the same session. (Other system/OS profiling tools and custom collectors can work together with VTune's EBS collector; see this article.)
3.    Because of point 2, Perf results can only be imported into a new VTune result directory.
4.    Once a Perf result has been imported, the VTune GUI can open and display it. The result can also be reported by the VTune command line, but with restrictions: only performance counters can be displayed. For example:
a)    "amplxe-cl -report hw-events -r r0001" works, but
b)    "amplxe-cl -report hotspots -r r0001" does not.
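
As a rough sketch of notes 1 and 4, the helpers below build the attach-mode command line and the report command line as strings. The function names attach_cmd and report_cmd and their parameters are hypothetical. The sketch accepts only the hw-events report, which is the one report type the notes confirm for imported Perf results; it prints the commands rather than running them.

```shell
#!/bin/sh
# Sketch: build attach-mode and report command lines (printed, not executed).
# attach_cmd and report_cmd are hypothetical helper names.

attach_cmd() {
    # $1 = PID to attach to, $2 = seconds to sample, $3 = output trace file.
    # "sleep N" bounds the collection time, as in the example in note 1.
    echo "amplxe--perf record -o $3 -T --force-per-cpu -e cpu-cycles,instructions -p $1 sleep $2"
}

report_cmd() {
    # $1 = report type, $2 = result directory.
    # Assumption: reject everything except hw-events, per note 4.
    case "$1" in
        hw-events)
            echo "amplxe-cl -report hw-events -r $2"
            ;;
        *)
            echo "report '$1' is not expected to work on an imported Perf result" >&2
            return 1
            ;;
    esac
}

attach_cmd 1234 10 peter1.perf   # 1234 is a placeholder PID
report_cmd hw-events r0001
```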


2 comments

Hal G. (Intel)'s picture

Questions regarding Intel® VTune™ Amplifier should be posted to the forums here: https://software.intel.com/en-us/forums/intel-vtune-amplifier-xe

Questions posted to Articles may or may not be responded to.

Regards, Hal

Intel(R) Developer Zone Support

https://software.intel.com
*Other names and brands may be claimed as the property of others.

shahnejat, ahmad's picture

Hi Peter,

I am using "Intel System Studio-->Intel VTune 2018" to profile and derive the control flow dependencies by making use of the Intel_PT PMU under the system:

Kernel: 4.15.0-13-generic, 64bit Ubuntu

CPU: Intel® Core™ i7-7820X @ 3.60GHz × 16 

I started with the following commands:

1- amplxe-perf record -o a.perf -T -e intel_pt// -- ps

  PID TTY          TIME CMD
21471 pts/1    00:00:00 amplxe-perf
21472 pts/1    00:00:00 ps
58693 pts/1    00:00:00 sudo
58694 pts/1    00:00:00 su
58695 pts/1    00:00:00 bash
[ perf record: Woken up 2 times to write data ]
[ perf record: Captured and wrote 3.154 MB a.perf ]

2- amplxe-cl -import a.perf -r folder

amplxe: Importing a new result 100 % done                                      
amplxe: Using result path `/home/amad/May2/folder'
amplxe: Executing actions 12 % Loading 'a.perf' file                           
amplxe: Error: Cannot load data file `/home/amad/May2/folder/data.0/a.perf' (Data file is corrupted).
amplxe: Executing actions 50 % done                                            
amplxe: Error: 0x4000001e (Cannot load raw collector data)

Although intel_pt data has not been successfully imported, the data for other kernel PMU events like "cpu-cycles" and "instructions" could be properly handled:

1- amplxe-perf record -o p.perf -T -e cpu-cycles,instructions -- ps

  PID TTY          TIME CMD
 8410 pts/0    00:00:00 sudo
 8458 pts/0    00:00:00 amplxe-perf
 8467 pts/0    00:00:00 ps
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.024 MB p.perf (96 samples) ]

2- amplxe-cl -import p.perf -r r2

amplxe: Importing a new result 100 % done                                      
amplxe: Using result path `/home/amad/r2'
amplxe: Executing actions 19 % Resolving information for `libprocps.so.6.0.0'  
amplxe: Warning: Cannot locate debugging information for file `/lib/x86_64-linux-gnu/libprocps.so.6.0.0'.
amplxe: Executing actions 21 % Resolving information for `vmlinux'             
amplxe: Warning: Cannot locate debugging information for the Linux kernel. Source-level analysis will not be possible. Function-level analysis will be limited to kernel symbol tables. See the Enabling Linux Kernel Analysis topic in the product online help for instructions.
amplxe: Executing actions 75 % Generating a report                             
Collection and Platform Info
----------------------------
Parameter         r2                                  
----------------  ------------------------------------
Operating System  4.15.0-13-generic                   
Computer Name     amad-pc                             
Result Size       2766877                             
Collector Type    Driverless Perf per-process sampling

CPU
---
Parameter          r2        
-----------------  ----------
Frequency          3600000000
Logical CPU Count  16        

Summary
-------
Elapsed Time:             0.011
Paused Time:              0.0  
CPU Time:                 0.011
Average CPU Utilization:  0.897

Event summary
-------------
Hardware Event Type  Hardware Event Count:Self  Hardware Event Sample Count:Self  Events Per Sample
-------------------  -------------------------  --------------------------------  -----------------
cpu-cycles                            40521584                                45  4000             
instructions                          36302909                                51  4000             
amplxe: Executing actions 100 % done

What is wrong with Intel_pt data?

Thanks

-- 

A. Shahnejat 

Peter Wang (Intel)'s picture

If you use VTune Amplifier XE 2015 Beta, use "amplxe-perf" instead of "amplxe--perf"
