Pane: Top-down Tree

To access this pane: Click the Top-down Tree sub-tab in the result tab.

Use the Top-down Tree pane to explore the call sequence flow of the application and analyze the time spent in each program unit and on its callees.

The Top-down Tree pane is part of the Top-down Tree window. This window is synchronized with the Bottom-up window: when you select a program unit in one window and switch to the other, the same unit is highlighted there.

Call Stack

The Call Stack column represents call sequences (stacks) detected during the collection phase, starting from the application root (usually, the main() function). The time value for a row equals the sum of the time values of all items nested under that row. Use this data to see the impact of program units together with their callees. This type of investigation is known as a top-down analysis.
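The roll-up described above can be sketched in a few lines of Python. This is a minimal illustration, not how VTune Amplifier is implemented: the Node class, the function names, and the sample times are all hypothetical. Each sample is a root-first stack plus the time attributed to its last frame, and every frame on the path accumulates that time.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One row of a top-down tree: a function reached via a specific call path."""
    name: str
    total: float = 0.0       # time in this function plus all of its callees
    self_time: float = 0.0   # time spent in this function's own code
    children: dict = field(default_factory=dict)


def build_top_down(samples):
    """Merge (stack, seconds) samples into a tree; stacks are listed root-first."""
    root = Node("<root>")
    for stack, seconds in samples:
        node = root
        node.total += seconds
        for frame in stack:
            node = node.children.setdefault(frame, Node(frame))
            node.total += seconds   # every frame on the path accumulates the sample
        node.self_time += seconds   # the last frame is where the time was spent
    return root


def render(node, depth=0):
    """Return indented text lines, one per call-stack row, largest total first."""
    lines = [f"{'  ' * depth}{node.name}: total={node.total:.1f} self={node.self_time:.1f}"]
    for child in sorted(node.children.values(), key=lambda n: n.total, reverse=True):
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical samples: the same leaf function is reached through two call paths.
samples = [
    (("main", "rank", "kernel"), 3.0),
    (("main", "worker", "kernel"), 2.0),
    (("main", "worker", "verify"), 0.5),
    (("main", "worker"), 0.5),
]
tree = build_top_down(samples)
print("\n".join(render(tree)))
```

In this sketch a row's total is its own time plus the totals of its nested rows, so a function that appears in two stacks (kernel here) gets a separate row under each caller.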

If the Intel® VTune™ Amplifier does not find debug information in the binaries, it identifies function boundaries statically and assigns hotspot addresses to generated pseudo names of the form func@address.
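As a rough illustration of the fallback naming, the sketch below builds a func@address placeholder when no symbol is known for a function's start address. The helper names, the symbol table, the addresses, and the hexadecimal formatting are assumptions made for this example; the exact text VTune Amplifier generates may differ.

```python
def pseudo_name(address):
    """Build a func@address placeholder for a function that has no symbol name.

    Hexadecimal formatting is an assumption made for this sketch.
    """
    return f"func@{address:#x}"


def display_name(address, symbols):
    """Prefer the real symbol name; fall back to a pseudo name when none is known."""
    return symbols.get(address) or pseudo_name(address)


# Hypothetical symbol table: only one of the two function start addresses has a name.
symbols = {0x401000: "main"}
names = [display_name(addr, symbols) for addr in (0x401000, 0x4025F0)]
print(names)
```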

Note

The call stacks are always available for the results of the user-mode sampling and tracing collection. They are also available for the results of the hardware event-based sampling collection, if you enabled the Collect stacks option during the analysis configuration. Otherwise, the Call Stack column for the event-based results shows a flat list of the functions.

Performance Metrics

By default, all program units are sorted in descending order by the Data of Interest column, so the most performance-critical program units appear first. Each data column in the table corresponds to a performance metric. The list of performance metrics varies depending on the analysis type and the selected viewpoint. In the Top-down Tree pane, the VTune Amplifier provides two types of metrics:

  • Self metrics show performance data collected within a particular procedure or function, excluding its callees.

  • Total metrics show performance data collected within a function and its children (callees).

You may click a column header to sort the table by the corresponding metric.
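The difference between the two metric types, and the effect of sorting by one or the other, can be shown with a short Python sketch. The program unit names and Self times below are hypothetical, and the dictionary layout is only a convenient stand-in for the rows of the pane.

```python
def total_time(unit):
    """Total metric: the unit's own (Self) time plus the Total time of every callee."""
    return unit["self"] + sum(total_time(c) for c in unit.get("callees", []))


def rows_sorted_by(units, metric):
    """Return (name, self, total) rows sorted by the chosen metric, largest first."""
    rows = [(u["name"], u["self"], total_time(u)) for u in units]
    column = {"self": 1, "total": 2}[metric]
    return sorted(rows, key=lambda row: row[column], reverse=True)


# Hypothetical sibling program units with made-up Self times in seconds.
units = [
    {"name": "parse", "self": 4.0, "callees": []},
    {"name": "solve", "self": 1.0, "callees": [
        {"name": "kernel", "self": 6.0, "callees": []},
    ]},
    {"name": "report", "self": 0.5, "callees": []},
]

by_total = rows_sorted_by(units, "total")
by_self = rows_sorted_by(units, "self")
print(by_total)
print(by_self)
```

Here solve ranks first by Total time because of its callee, while parse ranks first by Self time, which is why the choice of sort column changes which unit looks most critical.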

Infotips

Infotips show up as pop-up windows when you mouse over a column header (metric) or a performance issue highlighted in pink.

  • Use the metric infotip to see the metric description and the formula used to calculate the metric (if available).

  • Use the issue infotip to see the description of the detected issue, tuning advice, and the formula used to calculate the threshold for this metric. If the metric value exceeds the threshold and the program unit is a hotspot, the VTune Amplifier highlights this value in pink as performance-critical.
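The highlighting rule combines two conditions, which the following Python sketch makes explicit. The function name, the program units, the metric values, and the thresholds are hypothetical; the actual thresholds come from the formulas shown in the issue infotips.

```python
def is_performance_critical(value, threshold, is_hotspot):
    """Flag a metric value only when it exceeds the threshold and the unit is a hotspot."""
    return is_hotspot and value > threshold


# Hypothetical rows: (program unit, metric value, threshold, hotspot?)
rows = [
    ("kernel", 0.42, 0.20, True),    # above threshold and a hotspot: flagged
    ("helper", 0.42, 0.20, False),   # above threshold but not a hotspot
    ("parse", 0.05, 0.20, True),     # a hotspot, but below threshold
]
flagged = [name for name, value, threshold, hot in rows
           if is_performance_critical(value, threshold, hot)]
print(flagged)
```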

Example

In this example, the rankomp1 function is the biggest hotspot of the application. It shows up in two stacks and does not have any callees. In the first stack, rankomp1 is the only callee of the rank function, while in the second stack, its caller function _vcomp::ParallelRegion::HandlerThreadFunc has three other callees: full_verifyomp2, full_verifyomp1, and mainomp1.

The Total time values (in percent by default) for the nested items under _vcomp::ParallelRegion::HandlerThreadFunc add up to the time value for this row:

15.1% + 0.1% + 0.2% + 28.3% = 43.7%
