How to use Perf and import its result into VTune(TM) Amplifier XE?

Perf is a performance analysis tool built into the Linux* operating system. Its usage is similar to that of OProfile and GProf. Perf uses the Performance Monitoring Unit (PMU) to program performance counters before the target application is profiled, and then reports data such as elapsed CPU cycles, retired instructions, cache misses, and branch mispredictions once profiling completes.

For customers who need to collect an application's performance data with Perf from within VTune(TM) Amplifier XE, VTune Amplifier XE 2013 Update 17 integrates Perf into the product. The command is “amplxe-perf”. The existing VTune Amplifier command, amplxe-cl, can then import the resulting trace file into a VTune Amplifier result. Here is an example:
1.    amplxe-perf record -o peter.perf -T --force-per-cpu -e cpu-cycles,instructions -- ./primes.icc
Determining primes from 1 - 100000 
Found 9592 primes
[ perf record: Woken up 3 times to write data ]
[ perf record: Captured and wrote 0.924 MB peter.perf (~40350 samples) ]

2.    amplxe-cl -import peter.perf -r r0001
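
The two steps above can be wrapped in a small shell helper. This is only a sketch under stated assumptions: the function names `run` and `record_and_import` and the variables `PERF_BIN`, `VTUNE_CLI` and `DRY_RUN` are made up for illustration and are not part of VTune Amplifier. The command-line options are copied from the example above. Adjust `PERF_BIN` if the Perf binary has a different name on your installation.

```shell
#!/bin/sh
# Hypothetical knobs for this sketch (not VTune settings).
PERF_BIN="${PERF_BIN:-amplxe-perf}"   # Perf binary shipped with VTune Amplifier
VTUNE_CLI="${VTUNE_CLI:-amplxe-cl}"   # VTune Amplifier command-line tool
DRY_RUN="${DRY_RUN:-1}"               # 1 = only print the commands

# Print a command line, and execute it unless DRY_RUN=1.
run() {
    echo "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Usage: record_and_import TRACE RESULT_DIR APP [APP_ARGS...]
record_and_import() {
    if [ "$#" -lt 3 ]; then
        echo "usage: record_and_import TRACE RESULT_DIR APP [APP_ARGS...]" >&2
        return 2
    fi
    trace="$1"; result="$2"; shift 2

    # The trace is imported into a new result directory, so refuse to reuse one.
    if [ -e "$result" ]; then
        echo "error: result directory '$result' already exists" >&2
        return 1
    fi

    # Step 1: launch the application under Perf and write the trace file.
    run "$PERF_BIN" record -o "$trace" -T --force-per-cpu \
        -e cpu-cycles,instructions -- "$@" || return 1

    # Step 2: import the trace file into a VTune Amplifier result.
    run "$VTUNE_CLI" -import "$trace" -r "$result"
}
```

With `DRY_RUN=1` (the default in this sketch), calling `record_and_import peter.perf r0001 ./primes.icc` only prints the two command lines; set `DRY_RUN=0` to execute them.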

Notes:
1.    Perf has been integrated since VTune Amplifier XE 2013 Update 17. It supports launching an application as well as attaching to a running process. For example, to attach (here “sleep 10” limits the collection to 10 seconds): “amplxe-perf record -o peter1.perf -T --force-per-cpu -e cpu-cycles,instructions -p <PID> sleep 10”
2.    Perf performs PMU event-based sampling (EBS), so it cannot run alongside VTune’s EBS collector in the same session. (Other system/OS profiling tools and custom collectors can run alongside VTune’s EBS collector; see this article.)
3.    Perf results can only be imported into a new VTune result directory, for the reason given in note 2.
4.    Once a Perf result has been imported, the VTune GUI can open and display it. The result can also be reported with the VTune command line, but with a restriction: only performance counter data can be displayed. For example:
a)    “amplxe-cl -report hw-events -r r0001” works, but
b)    “amplxe-cl -report hotspots -r r0001” does not.
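
Notes 1 and 4 can be combined into one attach-mode workflow: attach to a running process for a fixed time, import the trace, then print the only report type that works for imported Perf data. The sketch below is illustrative only. The names `run`, `attach_import_report`, `PERF_BIN`, `VTUNE_CLI` and `DRY_RUN` are hypothetical, and the options are the ones shown in the notes above.

```shell
#!/bin/sh
# Hypothetical knobs for this sketch (not VTune settings).
PERF_BIN="${PERF_BIN:-amplxe-perf}"
VTUNE_CLI="${VTUNE_CLI:-amplxe-cl}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands

# Print a command line, and execute it unless DRY_RUN=1.
run() {
    echo "$*"
    if [ "$DRY_RUN" != "1" ]; then
        "$@"
    fi
}

# Usage: attach_import_report PID SECONDS TRACE RESULT_DIR
attach_import_report() {
    pid="$1"; secs="$2"; trace="$3"; result="$4"

    # Reject anything that is not a plain decimal process ID.
    case "$pid" in
        ''|*[!0-9]*)
            echo "error: PID must be a number, got '$pid'" >&2
            return 2
            ;;
    esac

    # Attach mode: sample the process for as long as "sleep" runs.
    run "$PERF_BIN" record -o "$trace" -T --force-per-cpu \
        -e cpu-cycles,instructions -p "$pid" sleep "$secs" || return 1

    # Import into a (new) result directory, then show the counter data.
    run "$VTUNE_CLI" -import "$trace" -r "$result" || return 1
    run "$VTUNE_CLI" -report hw-events -r "$result"
}
```

The helper deliberately offers no `hotspots` report, since that report type is not available for imported Perf results.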
