Problem with VTune APIs resume/pause and -mavx option

Hi there,
I'm trying to use the VTune resume/pause APIs in my code. I'm following the example from here:

http://software.intel.com/en-us/articles/how-to-call-resume-and-pause-ap...
(case 1)

I wrote the same code as in case 1 and it works, but if I add "-mavx" to

icpc -g test.cpp -I/opt/intel/vtune_amplifier_xe_2011/include /opt/intel/vtune_amplifier_xe_2011/lib64/libittnotify.a -lpthread -o test

VTune doesn't collect anything (I think it never resumes). I tried -msse3 and similar options and the problem is still there. I tried changing the code, without success. My conclusion is that something goes wrong when -mavx is combined with the VTune APIs (note that VTune does collect data if I remove the API calls).

Any solution? I know that for the moment I can remove -mavx, but this is part of a bigger code, so the option must be there when running on Sandy Bridge.

I'm running on Linux64 and the compiler version is:

Intel(R) C++ Intel(R) 64 Compiler XE for applications running on Intel(R) 64, Version 13.0.1.117 Build 20121010
Copyright (C) 1985-2012 Intel Corporation. All rights reserved.

Best regards,

Alfio


The catch is that the Intel Composer XE compiler optimizes aggressively when the "-mavx" option is added. The original example was effectively built with optimization turned off, because it used only "-g" and no other options. After optimization the built program may end up doing nothing, for example if the compiler determines that the computed result is never used. In that case the application terminates quickly and VTune Amplifier XE cannot capture any data.

If you use the "-O0" option, you will see the expected result.

Another way is to add "sleep(1)" after the statements that switch to the resume/pause state in the example code, and then analyze with VTune again. You will see the CPU running state and the "Empty" state in the "timeline" report (CPU utilization is low during "sleep").

Hello,
thanks for the answer. You are right for the linked example, but things are a little bit more complicated in my code....
As I said in my first message, I'm using pause/resume in a large code, and the part I'm interested in certainly takes more than 10 seconds.
It is a Fortran code and I'm using these options to compile (with -mavx as you suggested):

-O2 -g -openmp -mavx -vec-report2 -warn -funroll-loops -fpp -free -nogen-interfaces

and for linking I have also added:

-static -Wl,-upthread_attr_setstack,-udlclose
(the -Wl,-upthread_attr_setstack,-udlclose is needed to get VTune working)

Then I run VTune with: amplxe-cl -collect hotspots -start-paused --
This is the output:

Executing actions 50 % Generating a report
Summary
-------

Elapsed Time: 23.792
CPU Time: 0
CPU Usage: 0
Executing actions 100 % done

and I don't see anything in the GUI.
Further investigation shows that the culprit is the static linking. If I remove it, I get:

Executing actions 35 % Resolving information for `libc-2.11.3.so'
Warning: Cannot locate symbols for file `/lib64/libc-2.11.3.so'.
Executing actions 35 % Resolving information for `libtpsstool.so'
Warning: Cannot locate symbols for file `/opt/intel/vtune_amplifier_xe_2013/lib64/libtpsstool.so'.
Executing actions 36 % Resolving information for `libpthread-2.11.3.so'
Warning: Cannot locate symbols for file `/lib64/libpthread-2.11.3.so'.
Executing actions 37 % Resolving information for `libittnotify.so'
Warning: Cannot locate file `[stack]'.
Executing actions 50 % Generating a report
Summary
-------

Elapsed Time: 25.509
CPU Time: 657.411
CPU Usage: 31.844
Executing actions 100 % done

The output is OK.
Therefore I will remove the -static option for the VTune tests.

Best regards,

Alfio

The "-static" option is not supported with VTune; the tool needs to perform dynamic instrumentation during analysis.

Thanks for the update.
