Display inline functions in hotspots report?

When you compile your code with the Intel® C++ compiler at optimization level 2 or 3 (-O2 or -O3), the compiler may inline functions based on their size and call frequency. Without additional symbol information, VTune™ Amplifier XE does not display these inlined functions in the Hotspots report. In the following example, main() calls ProcessFile(), which calls ProcessBuffer(). ProcessBuffer() consumes most of the sample application's time, but the Hotspots report displays performance information only for main():

[shell]# amplxe-cl -collect hotspots -- ./gsexample2a datafile.txt
# amplxe-cl -report hotspots
Using result path `/opt/intel/vtune/samples/gsexample/r000hs'
Executing actions 75 % Generating a report
Function Module CPU Time
-------- ----------- --------
main gsexample2a 9.260
Executing actions 100 % done [/shell]


A temporary workaround is to disable inlining when you compile with -O2 or -O3 for performance data collection. See this article for more information.

Beginning with Update 7 of VTune Amplifier XE 2011 (released Dec. 28, 2011), the Hotspots report can display performance information for inlined functions when the code is compiled with the -inline-debug-info compiler option (this requires Intel® Composer XE 2011 SP1 (12.1.333) or later). For example, add the option to CFLAGS in the Makefile: "CFLAGS=-O2 -g -inline-debug-info"

Now when performance data is collected, all functions consuming time are displayed:

[shell]# amplxe-cl -collect hotspots -- ./gsexample2a datafile.txt
# amplxe-cl -report hotspots
Using result path `/opt/intel/vtune/samples/gsexample/r001hs'
Executing actions 75 % Generating a report
Function Module CPU Time
-------------------- ----------- --------
ProcessFile gsexample2a 7.110
ProcessBuffer gsexample2a 1.420
Store2Load gsexample2a 1.360
[Import thunk fread] gsexample2a 0.020
memset libc.so.6 0.010
Executing actions 100 % done [/shell]

 

Additionally, VTune Amplifier XE can hide inlined functions in command-line reports via the -inline-mode=off option. Although the inlined function data was collected, the report then looks like the one produced when the code was compiled without the -inline-debug-info option:

[shell]# amplxe-cl -report hotspots -inline-mode=off
Using result path `/opt/intel/vtune/samples/gsexample/r001hs'
Executing actions 74 % Generating a report
Function Module CPU Time
-------------------- ----------- --------
main gsexample2a 9.890
[Import thunk fread] gsexample2a 0.020
memset libc.so.6 0.010
Executing actions 99 % done [/shell]
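The two reports for result r001hs can be cross-checked: with -inline-mode=off, the time of the inlined functions is attributed to main(). The following is a minimal sketch that sums the CPU Time column for ProcessFile, ProcessBuffer, and Store2Load, using values copied from the report above. The helper name sum_cpu_time is hypothetical and is not part of VTune Amplifier XE.

```shell
#!/bin/sh
# Hypothetical helper: sum the last column (CPU Time) of report rows read from stdin.
sum_cpu_time() {
    awk '{ total += $NF } END { printf "%.3f\n", total }'
}

# Rows copied from the r001hs report generated with inline information enabled.
printf '%s\n' \
    "ProcessFile gsexample2a 7.110" \
    "ProcessBuffer gsexample2a 1.420" \
    "Store2Load gsexample2a 1.360" | sum_cpu_time
# prints 9.890, the CPU time reported for main with -inline-mode=off
```

The sum matches the 9.890 seconds shown for main in the -inline-mode=off report, which suggests that all three functions were inlined into main() in this build.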


Finally, there is a known issue listed in the VTune Amplifier XE Release Notes:

Do not use the -ipo option: it switches off inline debug information (200260765)

If you use the Intel® compiler to get performance data on inlined functions, add the -inline-debug-info option but avoid the -ipo option. Currently, -ipo prevents the compiler from generating inline debug information.
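A build script can guard against this known issue before compiling. The following is a minimal sketch, assuming the compiler flags are available as a single string. The function name check_inline_debug_flags is hypothetical, and the check matches only the literal -ipo flag named in the release note.

```shell
#!/bin/sh
# Hypothetical helper: returns 1 (and warns) when -ipo is combined with
# -inline-debug-info, and returns 0 otherwise.
check_inline_debug_flags() {
    has_inline=no
    has_ipo=no
    for flag in $1; do
        case "$flag" in
            -inline-debug-info) has_inline=yes ;;
            -ipo)               has_ipo=yes ;;
        esac
    done
    if [ "$has_inline" = yes ] && [ "$has_ipo" = yes ]; then
        echo "warning: -ipo disables inline debug information" >&2
        return 1
    fi
    return 0
}

# Flags from the Makefile example in this article.
check_inline_debug_flags "-O2 -g -inline-debug-info" && echo "flags OK"
```

Calling the function with "-O2 -g -inline-debug-info -ipo" prints the warning to standard error and returns a non-zero status, so a build script can stop before producing a binary that lacks inline debug information.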

For more complete information about compiler optimizations, see our Optimization Notice.