Hotspots View
Identify program units that took the most CPU time (hotspots). This viewpoint is available for all analysis results.
To interpret the performance data provided in the Hotspots viewpoint, you may follow the steps below:
Define a Performance Baseline
Start by exploring the Summary window, which provides general information on your application execution. Note that the Elapsed time, which covers the application run from start to termination, differs from the application CPU time, which is the sum of the active processor time (waiting time excluded) for all threads that run the application.
Use the Elapsed time value provided in the Summary window as a baseline for comparing versions before and after optimization. Note that while you tune the application, the Elapsed time tends to decrease, whereas the CPU time may increase as you add more threads to the application.
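The difference between Elapsed time and CPU time can be illustrated outside the profiler. The following is a minimal sketch that uses Python standard library clocks, not VTune Profiler measurements; the helper names `measure` and `mostly_waiting` are hypothetical. It shows that time spent waiting adds to the elapsed time but not to the CPU time.

```python
import time


def measure(fn):
    """Return (elapsed_seconds, cpu_seconds) for a single call of fn."""
    wall_start = time.perf_counter()   # wall-clock time, includes waiting
    cpu_start = time.process_time()    # process CPU time, excludes sleeping
    fn()
    elapsed = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return elapsed, cpu


def mostly_waiting():
    # Hypothetical workload: a long wait followed by a little computation.
    time.sleep(0.2)
    sum(i * i for i in range(50_000))


elapsed, cpu = measure(mostly_waiting)
print(f"elapsed={elapsed:.3f}s cpu={cpu:.3f}s")
```

Recording the elapsed value before and after a code change gives the same kind of baseline comparison that the Summary window supports.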
If you ran the Hotspots analysis in the hardware event-based sampling mode, the analysis metrics in the Summary window display the Microarchitecture Usage metric that helps you estimate the code efficiency on the current hardware platform. If this metric value is flagged as critical, consider running the Microarchitecture Exploration analysis, which dives deeper into hardware metrics.
Identify the Hottest Function
Start with the Top Hotspots section in the Summary window to get a list of the most time-consuming functions. Click a hotspot function to explore its call flow and other related metrics in the Bottom-up view.
By default, the data in the Bottom-up view is sorted by CPU Time in descending order, so the most time-consuming functions are listed first. Focus on the functions with the largest CPU time. These are your candidates for optimization.
Expand the CPU Time column to get more details on how effectively the CPU time was used.

Focus your tuning efforts on the program units with the largest Poor value. This means that during the execution of these program units your application underutilized the CPU time. The overall goal of optimization is to achieve the Ideal (green) or OK (orange) CPU utilization state and shorten the Poor and Over CPU utilization values.
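As a rough illustration of this prioritization, the sketch below ranks program units by their Poor time and computes the share of CPU time spent in the Ideal or OK state. The function names and the numbers are hypothetical sample data, not output produced by VTune Profiler.

```python
# Hypothetical per-function CPU time in seconds, split by utilization state.
profile = {
    "render_frame": {"Poor": 4.2, "OK": 1.0, "Ideal": 0.3, "Over": 0.0},
    "load_assets": {"Poor": 0.5, "OK": 0.2, "Ideal": 2.8, "Over": 0.1},
    "update_physics": {"Poor": 1.9, "OK": 2.5, "Ideal": 0.6, "Over": 0.4},
}


def rank_by_poor(data):
    """Return function names ordered by Poor time, largest first."""
    return sorted(data, key=lambda name: data[name]["Poor"], reverse=True)


def well_utilized_share(states):
    """Return the fraction of CPU time spent in the Ideal or OK state."""
    total = sum(states.values())
    if total == 0:
        return 0.0
    return (states["Ideal"] + states["OK"]) / total


ranking = rank_by_poor(profile)
for name in ranking:
    share = well_utilized_share(profile[name])
    print(f"{name}: Poor={profile[name]['Poor']:.1f}s, Ideal+OK share={share:.0%}")
```

In this sample, `render_frame` comes first in the ranking, so it would be the first tuning candidate.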

Identify Algorithm Issues
You can identify issues with the calling sequences in your application and improve performance by revising the way functions are called. The following methods to locate potential issues are available:
- Top-down Tree pane: Analyze the Total and Self time data for callers and callees of the hotspot function to understand whether this time can be optimized.
- Call Stack pane: Identify the highest contributing stack for the program unit(s) selected in the Bottom-up or Top-down Tree panes. Use the navigation buttons to see the different stacks that called the selected program unit(s). The contribution bar shows the contribution of the currently visible stack to the overall time spent by the selected program unit(s). You can also use the drop-down list in the Call Stack pane to view data for different types of stacks.
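The relationship between Self time, Total time, and stack contribution can be sketched as follows. This is a simplified model under stated assumptions, not the profiler's implementation: Total time is taken as a function's Self time plus the Total time of its callees, and a stack's contribution is its share of the time spent in the selected function. The `CallNode` class, the function names, and the timings are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CallNode:
    """One node of a top-down call tree (hypothetical model)."""
    name: str
    self_ms: int  # time spent in the function's own code
    callees: List["CallNode"] = field(default_factory=list)

    def total_ms(self) -> int:
        # Total time = Self time + Total time of every callee.
        return self.self_ms + sum(c.total_ms() for c in self.callees)


def stack_contributions(stack_ms: Dict[str, int]) -> Dict[str, float]:
    """Return each stack's share (in percent) of the selected function's time."""
    total = sum(stack_ms.values())
    if total == 0:
        return {}
    return {stack: 100.0 * ms / total for stack, ms in stack_ms.items()}


# The same function, kernel, is reached through two different callers.
parse = CallNode("parse", 400, [CallNode("kernel", 600)])
compute = CallNode("compute", 500, [CallNode("kernel", 2400)])
main = CallNode("main", 100, [parse, compute])

contributions = stack_contributions({
    "main > compute > kernel": 2400,
    "main > parse > kernel": 600,
})

print(main.total_ms(), compute.total_ms(), contributions)
```

A caller such as `compute` has a small Self time but a large Total time, which suggests that the cost lies in what it calls and in how often it calls it.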
Stack data is available by default for the user-mode sampling mode. To have this data for the hardware event-based sampling mode, you need to enable the Collect stacks option in the Hotspots analysis configuration.
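If you launch the collection from a script, the same configuration can be expressed as a command line. The sketch below only builds the argument list and does not run anything. It assumes that the `vtune` command-line tool is available and that the Hotspots analysis accepts the `sampling-mode` and `enable-stack-collection` knobs; verify both against the documentation of your installed version. The helper name `hotspots_command` and the target `./my_app` are hypothetical.

```python
import shlex


def hotspots_command(app_args, hw_sampling=False, collect_stacks=False):
    """Build an argument list for a Hotspots collection (assumed knob names)."""
    cmd = ["vtune", "-collect", "hotspots"]
    if hw_sampling:
        # Assumption: selects hardware event-based sampling.
        cmd += ["-knob", "sampling-mode=hw"]
        if collect_stacks:
            # Assumption: counterpart of the Collect stacks option.
            cmd += ["-knob", "enable-stack-collection=true"]
    cmd += ["--", *app_args]
    return cmd


cmd = hotspots_command(["./my_app"], hw_sampling=True, collect_stacks=True)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` once the knob names have been confirmed for your environment.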
Analyze Source
Double-click the hottest function to view its related source code in the Source/Assembly window. You can open the code editor directly from the Intel® VTune™ Profiler and edit your code (for example, to minimize the number of calls to the hotspot function).
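One common way to reduce the number of calls to a hotspot function is to hoist a loop-invariant call out of the loop. The following is a minimal sketch with hypothetical function names; whether this change applies depends on your code.

```python
calls = {"n": 0}  # counts how often the hotspot function runs


def expensive_scale(config):
    """Hypothetical hotspot: its result depends only on config."""
    calls["n"] += 1
    return sum(config) / len(config)


def normalize_slow(values, config):
    # Calls the hotspot once per element although the result never changes.
    return [v / expensive_scale(config) for v in values]


def normalize_fast(values, config):
    # Calls the hotspot once and reuses the result.
    scale = expensive_scale(config)
    return [v / scale for v in values]


values = [2.0, 4.0, 6.0]
config = [1.0, 3.0]

calls["n"] = 0
slow = normalize_slow(values, config)
slow_calls = calls["n"]

calls["n"] = 0
fast = normalize_fast(values, config)
fast_calls = calls["n"]

print(slow == fast, slow_calls, fast_calls)
```

Both versions return the same result, but the second one calls the hotspot function once instead of once per element. Rerun the analysis after such a change and compare the Elapsed time against your baseline.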
What's Next
If you ran the analysis with the default Show additional performance insights option, the Summary view includes the Insights section, which provides additional metrics for your target such as efficiency of the hardware usage and vectorization. This information helps you identify potential next steps for your performance analysis and understand where you could focus your optimization efforts.
