Displaying the source view with performance data on the command line using VTune(TM) Amplifier XE

Intel(R) VTune(TM) Amplifier XE can profile a user application and report hotspots, the places where the application consumes the most CPU time.
Usually, after collecting performance data, the user opens the result in the application GUI to display hot functions. To see the hot lines of a hot function, double-click that function to go to the source view, which displays all hot lines with performance data.
The problem is that prior versions of the product supported the hot-line view only in the GUI; on the command line, only hot functions were available. Since VTune(TM) Amplifier XE 2013 Update 9, the product supports displaying the assembly / source view on the command line. This is useful for users who run automated tests focused on critical source lines, that is, who write VTune command lines into their test scripts.
Use the "source-object" option to achieve this. Here are usage examples:
1. Collecting performance data
# amplxe-cl -collect advanced-hotspots -- ./primes.icc
2. Displaying hot lines in a specific function
a. # amplxe-cl -report hotspots -r r001ah/
amplxe: Using result path `/home/peter/problem_report/r001ah'
amplxe: Executing actions 50 % Generating a report
Function                 Module              CPU Time:Self
-----------------------  ------------------  -------------
findPrimes               primes.icc                  2.089
pthread_mutex_unlock     libpthread-2.12.so          0.002
__dentry_open            vmlinux                     0.001
...
b.  # amplxe-cl -report hotspots -source-object function=findPrimes -r r001ah/
Source Line  Source                                                            CPU Time:Self
-----------  ----------------------------------------------------------------  -------------
...
387              for (number = start; number < end; number += stride)           
388              {                                                              
389                  factor = 3;                                                
390                                                                             
391                  while ((number % factor) != 0 ) factor += 2;                      2.088
392                                                                             
393                  if ( factor == number )                                           0.001
394                  {                                                          
395                      pthread_mutex_lock (&cs); 
...
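
For the auto-test use case, the source view above can be reduced to its hottest line with a small filter. The following is only a sketch: `hottest_line` is a hypothetical helper name, and it assumes that a row belongs to the report when it starts with a line number and that its last whitespace-separated field is a decimal number only when that field is the CPU Time:Self value. A source line that itself ends in a bare decimal literal would be misread.

```shell
# Hypothetical helper: read a source-view report on standard input and
# print "<source line> <CPU time>" for the row with the largest CPU time.
hottest_line() {
    awk '
        # Keep rows that start with a line number and end with a decimal time.
        $1 ~ /^[0-9]+$/ && $NF ~ /^[0-9]+\.[0-9]+$/ {
            if (best_line == "" || $NF + 0 > max) {
                max = $NF + 0
                best_line = $1
                best_time = $NF
            }
        }
        # Print nothing when no row carried a time value.
        END { if (best_line != "") print best_line, best_time }
    '
}
```

A test script could pipe the command from step 2b into it, for example `amplxe-cl -report hotspots -source-object function=findPrimes -r r001ah/ | hottest_line`, and compare the printed line number against the expected critical line.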
3. Displaying assembly code with performance data on the command line
# amplxe-cl -report hotspots -source-object function=findPrimes -group-by basic-block,address -r r001ah/
Basic Block  Assembly                                 Source Line  CPU Time:Self
-----------  ---------------------------------------  -----------  -------------
...
0x4008e0     Block 13
 0x4008e0     movq  $0x3, -0x18(%rbp)                 389
0x4008e8     Block 14                                                      2.027
 0x4008e8     movq  -0x20(%rbp), %rax                 391                  0.003
 0x4008ec     movq  -0x18(%rbp), %rdx                 391
 0x4008f0     movq  %rdx, -0x10(%rbp)                 391
 0x4008f4     cqo                                     391                  0.070
 0x4008f6     movq  -0x10(%rbp), %rcx                 391
 0x4008fa     idiv %rcx                               391                  0.001
 0x4008fd     test %rdx, %rdx                         391                  1.880
 0x400900     jz 0x400911 <Block 16>                  391                  0.073
0x400902     Block 15                                                      0.061
 0x400902     mov $0x2, %eax                          391                  0.002
 0x400907     addq  -0x18(%rbp), %rax                 391                  0.001
 0x40090b     movq  %rax, -0x18(%rbp)                 391
...
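
The assembly view can be filtered in a similar way to find the single instruction with the most attributed CPU time. This is a sketch under assumptions: `hottest_instruction` is a hypothetical helper name, instruction rows are taken to be those whose first field is a hexadecimal address and whose second field is not `Block`, and the last field counts as a time only when it is a decimal number.

```shell
# Hypothetical helper: read an assembly-view report on standard input and
# print "<address> <CPU time>" for the instruction with the largest CPU time.
hottest_instruction() {
    awk '
        # Instruction rows start with an address; skip "Block N" summary rows
        # and rows whose last field is a source line number rather than a time.
        $1 ~ /^0x[0-9a-fA-F]+$/ && $2 != "Block" && $NF ~ /^[0-9]+\.[0-9]+$/ {
            if (best_addr == "" || $NF + 0 > max) {
                max = $NF + 0
                best_addr = $1
                best_time = $NF
            }
        }
        END { if (best_addr != "") print best_addr, best_time }
    '
}
```

It is meant to be fed by the command from step 3, for example `amplxe-cl -report hotspots -source-object function=findPrimes -group-by basic-block,address -r r001ah/ | hottest_instruction`.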

Comments


It could be a version issue.
# source /opt/intel/vtune_amplifier_xe_2013/amplxe-vars.sh
Copyright (C) 2009-2013 Intel Corporation. All rights reserved.
Intel(R) VTune(TM) Amplifier XE 2013 (build 320457)

# amplxe-cl -collect advanced-hotspots -- ./primes.icc
amplxe: Collection started. To stop the collection, either press CTRL-C or enter from another console window: amplxe-cl -r /home/peter/problem_report/r001ah -command stop.
Determining primes from 1 - 100000
Found 9592 primes
amplxe: Collection stopped.
amplxe: Using result path `/home/peter/problem_report/r001ah'
amplxe: Executing actions 16 % Resolving information for `ld-2.12.so'
amplxe: Warning: Cannot locate debugging symbols for file `/lib64/libpthread-2.12.so'.
amplxe: Warning: Cannot locate debugging symbols for file `/lib64/ld-2.12.so'.
amplxe: Executing actions 50 % Generating a report

Collection and Platform Info
----------------------------
Parameter                 r001ah
------------------------  ---------------------------------------------------------------------------
Application Command Line  ./primes.icc
User Name                 root
Operating System          2.6.32-71.el6.x86_64 Red Hat Enterprise Linux Server release 6.0 (Santiago)
Computer Name             snb01.sh.intel.com
Result Size               4920831

CPU
---
Parameter          r001ah
-----------------  ---------------------------------------
Name               Intel(R) Core(TM) Processor 2xxx Series
Frequency          3392549923
Logical CPU Count  8

Summary
-------
Elapsed Time: 0.880
CPU Usage: 2.383

Event summary
-------------
Hardware Event Type       Hardware Event Count:Self  Hardware Event Sample Count:Self  Events Per Sample
------------------------  -------------------------  --------------------------------  -----------------
CPU_CLK_UNHALTED.THREAD                  7524200000                              2213            3400000
CPU_CLK_UNHALTED.REF_TSC                 7085600000                              2084            3400000
INST_RETIRED.ANY                         2737000000                               805            3400000
amplxe: Executing actions 100 % done
[root@snb01 problem_report]# amplxe-cl -report hotspots -r r001ah/
amplxe: Using result path `/home/peter/problem_report/r001ah'
amplxe: Executing actions 50 % Generating a report
Function                 Module              CPU Time:Self
-----------------------  ------------------  -------------
findPrimes               primes.icc                  2.083
__do_softirq             vmlinux                     0.001
__pthread_mutex_lock     libpthread-2.12.so          0.001
__rcu_process_callbacks  vmlinux                     0.001
_spin_unlock_irqrestore  vmlinux                     0.001
do_lookup_x              ld-2.12.so                  0.001
strncpy_from_user        vmlinux                     0.001
amplxe: Executing actions 100 % done

# amplxe-cl -report hotspots -source-object function=findPrimes -group-by basic-block,address -r r001ah/
amplxe: Using result path `/home/peter/problem_report/r001ah'
amplxe: Executing actions 50 % Generating a report
Basic Block Assembly Source Line CPU Time:Self
----------- --------------------------------------- ----------- -------------
0x400804 Block 1
0x400804 pushq %rbp 370
0x400805 mov %rsp, %rbp 370
0x400808 sub $0x60, %rsp 370
0x40080c movq %rdi, -0x58(%rbp) 370
0x400810 movq -0x58(%rbp), %rax 371
0x400814 movq (%rax), %rax 371
0x400817 movq %rax, -0x50(%rbp) 371
0x40081b movq $0x61a8, -0x48(%rbp) 375
0x400823 movq -0x48(%rbp), %rax 379
0x400827 mov $0x1, %edx 379
0x40082c mov $0x0, %ecx 379
0x400831 test %rax, %rax 379
0x400834 cmovl %edx, %ecx 379
0x400837 movsxd %ecx, %rcx 379
0x40083a addq -0x48(%rbp), %rcx 379
0x40083e sar $0x1, %rcx 379
0x400841 imul $0x2, %rcx, %rax 379
0x400845 neg %rax 379
0x400848 addq -0x48(%rbp), %rax 379
0x40084c jz 0x40085b 379
0x40084e Block 2
0x40084e mov $0x1, %eax 379
0x400853 addq -0x48(%rbp), %rax 379
0x400857 movq %rax, -0x48(%rbp) 379
0x40085b Block 3
0x40085b movq -0x48(%rbp), %rax 381
0x40085f imulq -0x50(%rbp), %rax 381
0x400864 inc %rax 381
0x400867 movq %rax, -0x40(%rbp) 381
0x40086b movq -0x50(%rbp), %rax 382
0x40086f cmp $0x3, %rax 382
0x400873 jnz 0x400899 382
0x400875 Block 4
0x400875 movq $0x186a0, -0x38(%rbp) 382
0x40087d Block 5
0x40087d movq -0x38(%rbp), %rax 382
0x400881 movq %rax, -0x30(%rbp) 382
0x400885 movq $0x2, -0x28(%rbp) 384
0x40088d movq -0x40(%rbp), %rax 385
0x400891 cmp $0x1, %rax 385
0x400895 jz 0x4008ac 385
0x400897 Block 6
0x400897 jmp 0x4008b8 385
0x400899 Block 7
0x400899 movq -0x48(%rbp), %rax 383
0x40089d imulq -0x50(%rbp), %rax 383
0x4008a2 addq -0x48(%rbp), %rax 383
0x4008a6 movq %rax, -0x38(%rbp) 382
0x4008aa jmp 0x40087d 382
0x4008ac Block 8
0x4008ac movq -0x28(%rbp), %rax 385
0x4008b0 addq -0x40(%rbp), %rax 385
0x4008b4 movq %rax, -0x40(%rbp) 385
0x4008b8 Block 9
0x4008b8 movq -0x40(%rbp), %rax 387
0x4008bc movq %rax, -0x20(%rbp) 387
0x4008c0 Block 10
0x4008c0 movq -0x20(%rbp), %rax 387
0x4008c4 movq -0x30(%rbp), %rdx 387
0x4008c8 cmp %rdx, %rax 387
0x4008cb jl 0x4008e0 387
0x4008cd Block 11
0x4008cd jmp 0x400970 387
0x4008d2 Block 12
0x4008d2 movq -0x28(%rbp), %rax 387
0x4008d6 addq -0x20(%rbp), %rax 387
0x4008da movq %rax, -0x20(%rbp) 387
0x4008de jmp 0x4008c0 387
0x4008e0 Block 13
0x4008e0 movq $0x3, -0x18(%rbp) 389
0x4008e8 Block 14 2.007
0x4008e8 movq -0x20(%rbp), %rax 391 0.001
0x4008ec movq -0x18(%rbp), %rdx 391
0x4008f0 movq %rdx, -0x10(%rbp) 391
0x4008f4 cqo 391 0.068
0x4008f6 movq -0x10(%rbp), %rcx 391 0.001
0x4008fa idiv %rcx 391
0x4008fd test %rdx, %rdx 391 1.872
0x400900 jz 0x400911 391 0.065
0x400902 Block 15 0.075
0x400902 mov $0x2, %eax 391 0.002
0x400907 addq -0x18(%rbp), %rax 391
0x40090b movq %rax, -0x18(%rbp) 391
0x40090f jmp 0x4008e8 391 0.073
0x400911 Block 16
0x400911 movq -0x18(%rbp), %rax 393
0x400915 movq -0x20(%rbp), %rdx 393
0x400919 cmp %rdx, %rax 393
0x40091c jnz 0x4008d2 393
0x40091e Block 17
0x40091e mov $0x6c47a8, %eax 395
0x400923 mov %rax, %rdi 395
0x400926 callq 0x4006d8 395
0x40092b Block 18
0x40092b movl %eax, -0x60(%rbp) 395
0x40092e movq 0x2c3e6b(%rip), %rax 396
0x400935 imul $0x8, %rax, %rax 396
0x400939 mov $0x6012a0, %edx 396
0x40093e add %rax, %rdx 396
0x400941 movq -0x20(%rbp), %rax 396
0x400945 movq %rax, (%rdx) 396
0x400948 mov $0x1, %eax 397
0x40094d addq 0x2c3e4c(%rip), %rax 397
0x400954 movq %rax, 0x2c3e45(%rip) 397
0x40095b mov $0x6c47a8, %eax 398
0x400960 mov %rax, %rdi 398
0x400963 callq 0x400708 398
0x400968 Block 19
0x400968 movl %eax, -0x5c(%rbp) 398
0x40096b jmp 0x4008d2 398
0x400970 Block 20
0x400970 mov $0x0, %eax 401
0x400975 leaveq 401
0x400976 retq 401
0x400977 Block 21
0x400977 nop 401
amplxe: Executing actions 100 % done


I got two errors while executing the above commands.

1.

amplxe-cl -collect advanced-hotspots -r profileadv ./clustalw-mpi -infile=CFTR.input -newtree=CFTR.mytree
amplxe: Error: Cannot start data collection. nmi_watchdog interrupt capability is enabled on your system, which prevents collecting accurate event-based sampling data. Please disable nmi_watchdog interrupt or see the Troubleshooting section of the product documentation for details.
amplxe: Error: Cannot enable Hardware Event-based Sampling: problem with the driver (sep*/sepdrv*). Check that the driver is running and the driver group is in the current user group list. See "Building and Managing the Sampling Driver" help topic for further details.
amplxe: Warning: On some systems based on the Intel microarchitecture code name Nehalem / Westmere with C-states enabled, this analysis type may cause system hanging due to a known hardware issue (see errata AAJ134 in http://download.intel.com/design/processor/specupdt/320836.pdf). To avoid this situation, disable all "Cn(ACPI Cn) report to OS" BIOS options before sampling with VTune Amplifier on such systems.

and the second, when trying to view the source code:

2.
[kiran@sdp1 clustalw-mpi-0.13]$ amplxe-cl -R hotspots -source-object function=calc_score -r profile/
amplxe: Using result path `/home/kiran/installdir/clustalw-mpi-profile/clustalw-mpi-0.13/profile'
amplxe: Executing actions 50 % Generating a report

amplxe: Error: An internal error has occurred. Our apologies for this inconvenience. Please gather a description of the steps leading up to the problem and contact the Intel customer support team.

vcs/dvt/src/dicer/dvt_dicer_provider_query_impl.cpp(332): gen_helpers2::error_code_t dvt6_1::ProviderQueryImpl::executeQueryImpl(gen_helpers2::sptr_t &, msngr2::IProgress *, int): IsNot Valid State
amplxe: Executing actions 100 % done
amplxe: Error: Error 0x40000024 (Reporter error)
[kiran@sdp1 clustalw-mpi-0.13]$

Please help.


[kiran@sdp1 clustalw-mpi-0.13]$ amplxe-cl -report hotspots -source-object function=prfscore -r rhs000/
amplxe: Using result path `/home/kiran/installdir/clustalw-mpi-profile/clustalw-mpi-0.13/rhs000'
amplxe: Executing actions 50 % Generating a report

amplxe: Error: An internal error has occurred. Our apologies for this inconvenience. Please gather a description of the steps leading up to the problem and contact the Intel customer support team.

vcs/dvt/src/dicer/dvt_dicer_provider_query_impl.cpp(332): gen_helpers2::error_code_t dvt6_1::ProviderQueryImpl::executeQueryImpl(gen_helpers2::sptr_t &, msngr2::IProgress *, int): IsNot Valid State
amplxe: Executing actions 100 % done
amplxe: Error: Error 0x40000024 (Reporter error)

Same error. Please suggest a proper solution.


I found it was a version issue, so I installed the newest version, but I still can't use "-source-object function"; I get the following error:

amplxe: Error: An internal error has occurred. Our apologies for this inconvenience. Please gather a description of the steps leading up to the problem and contact the Intel customer support team.

vcs/dvt/src/dicer/dvt_dicer_provider_query_impl.cpp(335): gen_helpers2::error_code_t dvt6_1::ProviderQueryImpl::executeQueryImpl(gen_helpers2::sptr_t &, msngr2::IProgress *, int): IsNot Valid State
amplxe: Executing actions 100 % done
amplxe: Error: Error 0x40000024 (Reporter error)

Also, should I specify the source code directory? If so, which option should I use to specify it? Thanks.


Hello, I also want to use the VTune command line to analyze source code, but when I run "amplxe-cl -report hotspots -source-object function=findPrimes -r r001ah/", it just reports "Unknown option: -source-object". Do you know why? Thanks in advance.
Best
Emily