Configure the Intel® VTune™ Amplifier data view to display performance data per inline function for applications built in the Release configuration.


This option is supported if you compile your code using:

  • Linux*: GCC* compiler 4.1 (or higher)

  • Linux* and Windows*: Intel® Compiler (12.1.333 or higher), with the -debug inline-debug-info option (Linux) or the /debug:inline-debug-info option (Windows) enabled
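
As an illustration of the kind of code this option applies to, here is a minimal sketch of a small helper that an optimizing compiler is likely to inline into its caller. The GetModelParams name is borrowed from the examples on this page, but its body, the ModelParams type, and the AccumulateModel caller are hypothetical. The build lines in the comment are assumptions about a typical setup (optimization plus debug information); only the inline-debug-info options come from the list above.

```cpp
// Hypothetical example; function bodies and types are illustrative only.
//
// Possible build lines (assumed, adjust to your toolchain):
//   GCC:             g++ -O2 -g inline_demo.cpp
//   Intel (Linux):   icpc -O2 -g -debug inline-debug-info inline_demo.cpp
//   Intel (Windows): icl /O2 /Zi /debug:inline-debug-info inline_demo.cpp
#include <cstddef>
#include <vector>

struct ModelParams {
    double scale;
    double offset;
};

// Small helper that an optimizing compiler is likely to inline into its caller.
inline ModelParams GetModelParams(std::size_t i) {
    return ModelParams{1.0 + 0.5 * static_cast<double>(i % 4),
                       0.25 * static_cast<double>(i % 2)};
}

// Caller: once the helper is inlined, its work becomes part of this function's
// machine code, so a profiler needs inline debug info to report it separately.
double AccumulateModel(const std::vector<double>& samples) {
    double total = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        const ModelParams p = GetModelParams(i);
        total += p.scale * samples[i] + p.offset;
    }
    return total;
}
```

Without inline debug information, time spent in the inlined helper can only be reported against the enclosing function.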

View Inline Functions

To view data on inline functions, in the analysis result window, set the Inline Mode filter bar option to Show inline functions. VTune Amplifier will display inline functions (virtual frames) as regular functions.

To disable displaying inline functions, select Hide inline functions.
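
The difference between the two settings can be pictured with a small conceptual model. This is only a sketch of the idea, not VTune Amplifier's implementation: the Frame and Sample types and the AttributeTime function are hypothetical names introduced for illustration. Each sample carries a call stack in which inline functions appear as virtual frames. With inline functions shown, time goes to the innermost frame; with them hidden, virtual frames are skipped and time goes to the innermost regular function.

```cpp
#include <map>
#include <string>
#include <vector>

// One frame of a sampled call stack; `inlined` marks a virtual (inline) frame.
struct Frame {
    std::string function;
    bool inlined;
};

// A sample: stack ordered from outermost caller to innermost frame, plus CPU time.
struct Sample {
    std::vector<Frame> stack;
    double cpu_time;
};

// Attribute each sample's time to exactly one function.
// show_inline == true : use the innermost frame, inline or not.
// show_inline == false: skip inline frames and use the innermost regular function.
std::map<std::string, double> AttributeTime(const std::vector<Sample>& samples,
                                            bool show_inline) {
    std::map<std::string, double> totals;
    for (const Sample& s : samples) {
        // Walk from the innermost frame outwards until an eligible frame is found.
        for (auto it = s.stack.rbegin(); it != s.stack.rend(); ++it) {
            if (show_inline || !it->inlined) {
                totals[it->function] += s.cpu_time;
                break;
            }
        }
    }
    return totals;
}
```

In this model, a sample taken inside an inlined GetModelParams called from main is reported under GetModelParams when inline functions are shown and under main when they are hidden, which matches the behavior described in the examples that follow.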

Example 1: Inline Mode for Hotspots Analysis

In this example, you enable the Show inline functions option for the Hotspots analysis. This mode shows a full stack for the GetModelParams inline function:

You can select the Source Function/Function/Call Stack level in the Grouping menu to view all instances of the inline function in one row.

If you double-click the GetModelParams inline function, you can identify the code line that took the most CPU time and analyze the corresponding assembly code:

Example 2: Inline Mode Disabled for Hotspots Analysis

When you select the Hide inline functions option on the filter bar for the same sample, the VTune Amplifier does not show the GetModelParams function in the Bottom-up view:

But if you double-click the main function entry and explore the source, you can see that all CPU time is attributed to the code line where the GetModelParams inline function is called:

Example 3: Inline Mode for GPU In-kernel Profiling

By default, the Inline Mode for GPU In-kernel Profiling analysis view is disabled. In this example, 100% of GPU Cycles are attributed to the GPU_FFT_Global function:

Double-clicking the GPU_FFT_Global source function opens the source view positioned on the code line that invokes this function, with 95.3% of Estimated GPU Cycles attributed to it:

But if you select the Computing Task/Function/Call Stack or Computing Task/Source Function/Call Stack grouping level and enable the Inline Mode for this view, you see that the GPU_FFT_Global function took only 4.7% of the GPU Cycles, while four inline functions took the rest of the cycles:

Double-click the hottest GPU_FftIteration function to analyze its source and assembly code:
