User Guide

  • 2020
  • 06/18/2020
  • Public Content

User-Mode Sampling and Tracing Collection

When profiling application execution, the Intel® VTune™ Profiler takes snapshots of how that application utilizes the processors in the system. A thread is considered active at a given moment if it is executing or ready to execute (that is, not blocked). The number of active threads in each snapshot indicates the degree of parallelism of the application and how well the application utilizes processor resources. VTune Profiler classifies utilization into the following ranges: Idle, Poor, Ok, and Ideal.
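The classification can be pictured as a histogram over snapshots, keyed by how many threads were active in each one. The following Python sketch illustrates the idea only. The cut-off ratios and the names `classify` and `utilization_histogram` are assumptions made for this example and do not reflect the thresholds that VTune Profiler actually applies.

```python
from collections import Counter

def classify(active_threads: int, logical_cpus: int) -> str:
    """Map the active-thread count of one snapshot to a utilization range.

    The cut-offs (0.5 and 0.9) are illustrative assumptions, not the
    values used by VTune Profiler.
    """
    if active_threads == 0:
        return "Idle"
    ratio = active_threads / logical_cpus
    if ratio < 0.5:
        return "Poor"
    if ratio < 0.9:
        return "Ok"
    return "Ideal"

def utilization_histogram(snapshots, logical_cpus):
    """Count how many snapshots fall into each utilization range."""
    return Counter(classify(n, logical_cpus) for n in snapshots)

# Hypothetical active-thread counts observed in eight snapshots.
snapshots = [0, 1, 1, 2, 4, 7, 8, 8]
hist = utilization_histogram(snapshots, logical_cpus=8)
print(dict(hist))
```

With these assumed thresholds, an application that spends most snapshots in the Poor range on an 8-CPU system is a candidate for additional parallelism.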
The user-mode sampling and tracing collector interrupts a process, collects the values of all active instruction addresses, and captures a calling sequence for each of these samples. Sampled instruction pointers (IPs), along with their calling sequences (stacks), are stored in data collection files. Statistically collected IP samples with calling sequences enable the viewer to display a call graph and the most time-consuming paths. Use this data to understand the control flow for statistically important code sections.
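To see how statistical samples turn into time-consuming paths, consider the following Python sketch. It assumes that each sample is a tuple of function names ordered from the outermost caller to the function that was executing, and that every sample represents one 10 ms interval. The function names `hot_paths` and `self_time` and the sample data are hypothetical; the real data collection file format is not shown here.

```python
from collections import Counter

SAMPLE_INTERVAL_MS = 10  # assumed time represented by one sample

def hot_paths(samples, top=3):
    """Return the most frequently sampled stacks with their estimated time in ms."""
    counts = Counter(samples)
    return [(stack, n * SAMPLE_INTERVAL_MS) for stack, n in counts.most_common(top)]

def self_time(samples):
    """Estimate per-function self time in ms from the leaf frame of each sample."""
    leaves = Counter(stack[-1] for stack in samples)
    return {fn: n * SAMPLE_INTERVAL_MS for fn, n in leaves.items()}

# Hypothetical samples: outermost caller first, sampled function last.
samples = [
    ("main", "parse", "read_token"),
    ("main", "parse", "read_token"),
    ("main", "solve", "multiply"),
    ("main", "solve", "multiply"),
    ("main", "solve", "multiply"),
    ("main", "solve"),
]
paths = hot_paths(samples, top=2)
times = self_time(samples)
print(paths)
print(times)
```

Because the estimate is statistical, functions that appear in only a few samples carry a large relative error, which is why the data is most useful for statistically important code sections.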
On Linux*, the user-mode sampling and tracing collector embeds an agent library into the profiled application. The agent sets up an OS timer for each thread in the application. When a timer expires, the application receives SIGPROF or another runtime signal, which the collector handles.
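The timer-and-signal mechanism can be approximated in a few lines of Python on Linux. This is a simplified sketch and not the agent library itself: it uses one process-wide `ITIMER_PROF` timer instead of a timer per thread, it runs the handler only in the main thread (a restriction of Python signal handling), and it records function names instead of instruction addresses. The names `on_sigprof` and `busy_work` are hypothetical.

```python
import signal
from collections import Counter

samples = Counter()

def on_sigprof(signum, frame):
    """Record the interrupted call stack as a tuple of function names."""
    names = []
    while frame is not None:
        names.append(frame.f_code.co_name)
        frame = frame.f_back
    samples[tuple(reversed(names))] += 1  # outermost caller first

def busy_work(n):
    """CPU-bound loop that gives the timer something to interrupt."""
    total = 0
    for i in range(n):
        total += i * i
    return total

signal.signal(signal.SIGPROF, on_sigprof)
# Deliver SIGPROF after every 10 ms of CPU time consumed by the process.
signal.setitimer(signal.ITIMER_PROF, 0.01, 0.01)
try:
    busy_work(5_000_000)
finally:
    signal.setitimer(signal.ITIMER_PROF, 0, 0)  # disarm the timer

print(sum(samples.values()), "samples collected")
```

Because `ITIMER_PROF` counts CPU time rather than wall-clock time, a process that is blocked does not generate samples in this sketch.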
The average overhead of the user-mode sampling and tracing collector is about 5% when sampling uses the default interval of 10 ms.
VTune Profiler uses the user-mode sampling and tracing collector to collect data for the following analysis types:
You can also create a custom analysis type based on the user-mode sampling and tracing collection.

Collecting Stack Data

When collecting data, the VTune Profiler analyzes no more than one stack per configured interval. It unwinds stacks every 10 milliseconds of thread execution. However, the VTune Profiler may skip or emulate stack unwinding for performance reasons. In this case, when processing the collected data during finalization, the VTune Profiler tries to find matching stacks in the history for events without stacks.
This approach reduces stack unwinding overhead but may produce incorrect stacks because of wrong matches. In such cases, the VTune Profiler displays pseudo nodes in the bottom-up/top-down trees marked as [Guessed frame(s)] and [Skipped frame(s)]. See Troubleshooting to learn how to overcome these problems.
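The following Python sketch shows one way such history matching could work and why it can go wrong. The matching rule used here (same thread and same sampled function) and the way the two pseudo-node labels are assigned are assumptions made for illustration; the heuristics that VTune Profiler actually applies during finalization are not described in this guide. The function name `fill_missing_stacks` and the event layout are hypothetical.

```python
def fill_missing_stacks(events):
    """Pair every event with a stack and an optional pseudo-node marker.

    events: list of dicts with keys "tid", "ip" and "stack" (tuple or None),
    in collection order. Returns a list of (stack, marker) tuples.
    The matching rule is a hypothetical heuristic, not VTune Profiler's.
    """
    history = {}  # (tid, ip) -> most recently unwound stack
    resolved = []
    for ev in events:
        key = (ev["tid"], ev["ip"])
        if ev["stack"] is not None:
            history[key] = ev["stack"]
            resolved.append((ev["stack"], None))
        elif key in history:
            # Reused stack: wrong if the function was reached via another caller.
            resolved.append((history[key], "[Guessed frame(s)]"))
        else:
            # No earlier stack to reuse: only the sampled function is known.
            resolved.append(((ev["ip"],), "[Skipped frame(s)]"))
    return resolved

events = [
    {"tid": 1, "ip": "multiply", "stack": ("main", "solve", "multiply")},
    {"tid": 1, "ip": "multiply", "stack": None},
    {"tid": 1, "ip": "read_token", "stack": None},
]
resolved = fill_missing_stacks(events)
for stack, marker in resolved:
    print(stack, marker)
```

In this sketch, the second event inherits the callers of the first one. If `multiply` had actually been called from a different function at that moment, the inherited stack would be a wrong match, which is the kind of error the pseudo nodes warn about.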
VTune Profiler may also display [Unknown frame(s)] nodes if it could not locate symbol files for system or application modules when unwinding the stack. See Resolving Unknown Frame(s) for more details.
