User Guide

View Stacks

Manage the Intel® VTune™ Profiler view to display call stacks for user and system functions and estimate the impact of each stack on the performance metrics. Intel VTune Profiler provides call stack information in the Call Stack pane, Bottom-up pane, Top-down Tree pane, and Caller/Callee pane. You can use the following options to manage and analyze stacks in different views:

Change Stack Layout

Manage the stack representation in the grid (Bottom-up or Top-down Tree pane) by using the stack layout toolbar button.
The button dynamically changes according to the selected layout. For example, if the chain layout is selected for the view, the button changes to show an option to choose a tree layout, and vice versa.
Chain layouts are typically more useful for the bottom-up view, while tree layouts are more natural for the top-down view.
Chain layout in the Top-down Tree pane is possible only if there is no branching and all data column values are the same for the parent and the child.
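The chain-layout condition above can be sketched in a few lines of Python. This is an illustration only, not the profiler's implementation: the `Node` type, its `metrics` dictionary, and the `can_show_as_chain` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical row of a top-down tree: a function and its data columns."""
    name: str
    metrics: dict
    children: list = field(default_factory=list)


def can_show_as_chain(node: Node) -> bool:
    """Return True if the subtree is a single path with identical data columns."""
    while node.children:
        if len(node.children) > 1:
            return False  # branching: a chain cannot represent two callees
        child = node.children[0]
        if child.metrics != node.metrics:
            return False  # parent and child differ in at least one data column
        node = child
    return True


# Hypothetical data: main -> render -> trace, all with the same CPU time.
leaf = Node("trace", {"cpu_time": 2.0})
linear = Node("main", {"cpu_time": 2.0}, [Node("render", {"cpu_time": 2.0}, [leaf])])
branching = Node("main", {"cpu_time": 3.0},
                 [Node("a", {"cpu_time": 3.0}), Node("b", {"cpu_time": 3.0})])
```

Under these assumptions, `linear` qualifies for a chain layout and `branching` does not.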

Navigate Between Stacks

To view stacks for the selected program unit, estimate stack contribution, and identify the most performance-critical stack, use the Call Stack pane and click the next/previous arrows.
To view information on several stacks or program units, Ctrl-click to select them in the Bottom-up or Top-down Tree pane. The Call Stack pane shows the highest contributing stack of all the selected stacks, with the contribution calculated relative to the sum of all selected stacks. All the stacks related to the selection are added to the pane, and you can navigate to them using the next/previous arrows.
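As a rough sketch of how such a contribution could be computed, the Python snippet below ranks a selection of stacks by their share of the summed metric. The `rank_stacks` helper, the stack names, and the CPU times are hypothetical; the profiler's exact calculation is not documented in this section.

```python
def rank_stacks(selected):
    """Rank (stack, cpu_time) pairs by their share of the selection total.

    Returns (stack, contribution_percent) pairs, highest contributor first.
    """
    total = sum(cpu_time for _, cpu_time in selected)
    ranked = sorted(selected, key=lambda item: item[1], reverse=True)
    return [
        (stack, 100.0 * cpu_time / total if total else 0.0)
        for stack, cpu_time in ranked
    ]


# Hypothetical selection: two stacks with made-up CPU times in seconds.
selection = [
    ("grid_intersect <- intersect_objects <- shader", 1.0),
    ("grid_intersect <- intersect_objects <- trace", 3.0),
]
ranking = rank_stacks(selection)
```

With this data, the `trace` stack comes first with 75% of the selection and the `shader` stack follows with 25%.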
Note that although each stack in the Bottom-up pane corresponds to a call stack provided in the Call Stack pane, the number of tree branches in the Bottom-up grid does not necessarily equal the number of stacks in the Call Stack pane. The counts may differ because stacks in the Bottom-up pane are function-based, whereas stacks in the Call Stack pane are line-number-based.
For example, in the screen capture below, the Bottom-up pane shows two stacks for the grid_intersect function, whereas the Call Stack pane shows that 17 stacks exist.
If you navigate between stacks in the Call Stack pane using the navigation buttons, you see that the grid_intersect function was called several times from the intersect_objects function (in other words, it had several call sites in this function). According to the Call Stack pane, the intersect_objects function in the first stack was called from the shader function (line 139). In the second stack, it was called from the trace function (line 76). For the Call Stack pane, which shows full stack information, these two stacks are different. For the Bottom-up view, they are identical up to this point because both lead to the same intersect_objects function. So, the Bottom-up view merges stack information from the point where the calling sequences become identical and represents them as one stack. The Bottom-up view also aggregates stacks from different call sites into one and sums up their CPU time. This aggregated representation is useful if you are interested in how stacks differ by their top caller functions.
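The difference between function-based and line-number-based stacks can be illustrated with a short Python sketch. The function names and the caller lines 139 and 76 come from the example above. Everything else is hypothetical: the `aggregate` helper, the call-site lines 85 and 91 inside intersect_objects, the sample line 460, and the CPU times.

```python
from collections import defaultdict


def aggregate(samples, key):
    """Sum CPU time per stack, identifying each frame by key(frame)."""
    totals = defaultdict(float)
    for frames, cpu_time in samples:
        totals[tuple(key(frame) for frame in frames)] += cpu_time
    return dict(totals)


# Each frame is (function, line); frames are listed from callee to caller.
samples = [
    ((("grid_intersect", 460), ("intersect_objects", 85), ("shader", 139)), 1.0),
    ((("grid_intersect", 460), ("intersect_objects", 91), ("shader", 139)), 0.5),
    ((("grid_intersect", 460), ("intersect_objects", 85), ("trace", 76)), 2.0),
    ((("grid_intersect", 460), ("intersect_objects", 91), ("trace", 76)), 0.25),
]

# Line-number-based: frames differ if either the function or the line differs.
line_based = aggregate(samples, key=lambda frame: frame)

# Function-based: call sites within the same function collapse into one frame.
function_based = aggregate(samples, key=lambda frame: frame[0])
```

In this sketch, `line_based` keeps four distinct stacks, while `function_based` keeps two (one through shader, one through trace) and sums the CPU time of the merged call sites.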

View Stacks per Metric

Use the drop-down menu in the Call Stack pane to choose the stack type for the selected program unit.
For example, when a synchronization object is selected in the Threading analysis result, you can set the Call Stack pane to show the stacks where that object was created, signaled, or waited for.

View System Functions in the Stack

To control whether system functions appear in the stacks in the grid and in the Call Stack pane, use the Call Stack Mode menu provided on the filter toolbar.
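Conceptually, such a mode acts as a filter on the frames of each stack. The Python sketch below shows the idea with a simple on/off flag. The modes the menu actually offers are not listed in this section, so the two-state `show_system` flag, the `apply_call_stack_mode` helper, and the sample frames are all assumptions made for illustration.

```python
def apply_call_stack_mode(frames, show_system):
    """Return the frames to display for one stack.

    frames: list of (function_name, is_system) pairs, callee first.
    show_system: hypothetical flag; when False, system frames are dropped.
    """
    if show_system:
        return list(frames)
    return [frame for frame in frames if not frame[1]]


# Hypothetical stack: a user function that waits inside a system call.
stack = [
    ("WaitForSingleObject", True),
    ("wait_for_worker", False),
    ("main", False),
]
user_only = apply_call_stack_mode(stack, show_system=False)
everything = apply_call_stack_mode(stack, show_system=True)
```

Here `user_only` keeps only `wait_for_worker` and `main`, while `everything` returns the stack unchanged.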

View Source for a Stack Function

If you double-click a row in the Call Stack pane or click a function name provided as a hyperlink, the source file opens in the Source/Assembly window at the code that generated the item in the selected row.
For example, in a Threading analysis result, if you double-click the topmost item of the Wait Time (Sync Object Creation) stack, the related source file opens at the source line that created the corresponding synchronization object.
If the source code is not found, you can either locate it manually or open the Assembly pane for this program unit.
If you select a system function, the Source/Assembly window opens the source file of the system function if it is available. If not, it shows the disassembly for the binary file containing this system function.
