User Guide

  • 2020
  • 06/18/2020

HOW: Analysis Types

Intel® VTune™ Profiler provides a set of pre-configured analysis types that you can start with to address your particular performance optimization goals.
When you create a project, the VTune Profiler opens the Configure Analysis window, which prompts you to specify WHAT you want to analyze (an application, a process, or a whole system), WHERE you plan to run the analysis, and HOW you need to run it.
Configure Analysis: Analysis Type
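The WHAT / WHERE / HOW split can also be pictured as a small data structure. The sketch below is illustrative only: the AnalysisConfig class, its field names, the result directory name r000, and the ./my_app target are hypothetical and not part of the product. It assumes the vtune command-line tool accepts the form "vtune -collect <analysis type> -result-dir <dir> -- <application>" and that "hotspots" is the command-line name of the Hotspots analysis; verify both against the command-line help of your installed version.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class AnalysisConfig:
    """Hypothetical container mirroring the WHAT / WHERE / HOW panes."""
    what: List[str]            # WHAT: application to launch, plus its arguments
    how: str                   # HOW: analysis type name (assumed CLI spelling)
    where: str = "local"       # WHERE: system that runs the analysis
    result_dir: str = "r000"   # hypothetical result directory name

    def to_command(self) -> List[str]:
        """Build an argument list for the vtune command-line tool.

        Only a local launch is sketched; a remote WHERE needs extra
        options that this example does not attempt to model.
        """
        if self.where != "local":
            raise NotImplementedError("only local targets are sketched")
        if not self.what:
            raise ValueError("WHAT must name an application to launch")
        return ["vtune", "-collect", self.how,
                "-result-dir", self.result_dir, "--", *self.what]


config = AnalysisConfig(what=["./my_app", "--size", "1024"], how="hotspots")
print(" ".join(config.to_command()))
```

The returned list can be passed to subprocess.run on a machine where the tool is installed; keeping it as a list rather than a single string avoids shell quoting problems.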
Clicking the Browse button in the HOW pane opens an analysis guide that helps you choose an analysis type applicable to your workload. Analysis types are organized into groups:
Hotspots group:
  • Hotspots is best for analyzing call paths to find where your code is spending the most time and discover opportunities for tuning your algorithms. Applies to C/C++, Fortran, Java*, or Python* apps and more, including apps running in containers such as Docker*, LXC, and others. See Finding Hotspots tutorial: Linux | Windows.
  • Memory Consumption is best for analyzing memory consumption by your app, its distinct memory objects, and their allocation stacks. Applies to C/C++ or Python apps. This analysis is supported for Linux targets only.
Parallelism group:
  • Threading is best for visualizing thread parallelism on available cores, locating causes of low concurrency, and identifying serial bottlenecks in your code. Applies to C/C++, Fortran, or Java* apps and more.
  • HPC Performance Characterization is best for understanding how your compute-intensive OpenMP* or MPI app is using the CPU, memory, and floating point unit (FPU) resources. Applies to C/C++ or Fortran apps and more. See Analyzing an OpenMP* and MPI Application tutorial: Linux.
Microarchitecture group:
  • Microarchitecture Exploration (formerly known as General Exploration) is best for identifying the CPU pipeline stage (front-end, back-end, etc.) and hardware units responsible for your hardware bottlenecks. Applies to C/C++, Fortran, or Java* apps and more, including apps in containers such as Docker* or LXC.
  • Memory Access is best for memory-bound apps to determine which level of the memory hierarchy is impacting your performance by reviewing CPU cache and main memory usage, including possible NUMA issues. Applies to C/C++, Fortran, or Java* apps and more.
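As a rough aid for choosing among the three groups above, the language coverage stated in each bullet can be captured in a lookup table. This is a minimal sketch: the ANALYSIS_TYPES table and the analyses_for helper are hypothetical names, and because several entries say "and more", the table is only a lower bound on what each analysis supports.

```python
# Languages listed in the text for each analysis type, with its group.
# Several entries end with "and more", so treat this as a lower bound.
ANALYSIS_TYPES = {
    "Hotspots": {
        "group": "Hotspots",
        "languages": {"C/C++", "Fortran", "Java", "Python"},
    },
    "Memory Consumption": {
        "group": "Hotspots",
        "languages": {"C/C++", "Python"},  # Linux targets only
    },
    "Threading": {
        "group": "Parallelism",
        "languages": {"C/C++", "Fortran", "Java"},
    },
    "HPC Performance Characterization": {
        "group": "Parallelism",
        "languages": {"C/C++", "Fortran"},
    },
    "Microarchitecture Exploration": {
        "group": "Microarchitecture",
        "languages": {"C/C++", "Fortran", "Java"},
    },
    "Memory Access": {
        "group": "Microarchitecture",
        "languages": {"C/C++", "Fortran", "Java"},
    },
}


def analyses_for(language):
    """Return analysis types whose listed languages include `language`."""
    return [name for name, info in ANALYSIS_TYPES.items()
            if language in info["languages"]]


for lang in ("Python", "Java"):
    print(lang, "->", ", ".join(analyses_for(lang)))
```

A dictionary keeps the table easy to extend with the Platform group or with other criteria, such as supported operating systems.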
In addition, the VTune Profiler offers the Platform analysis group, which is helpful in specific use cases such as GPU, I/O, and IRQ analysis:
  • System Overview is a driverless event-based sampling analysis that monitors general behavior of your target Linux* or Android* system and correlates power and performance metrics with the interrupt request (IRQ) handling.
  • GPU Offload (preview) is targeted for applications using a Graphics Processing Unit (GPU) for rendering, video processing, and computations. It helps you identify whether your application is CPU or GPU bound.
  • GPU Compute/Media Hotspots (preview) is targeted for GPU-bound applications and helps analyze GPU kernel execution per code line and identify performance issues caused by memory latency or inefficient kernel algorithms.
  • Input and Output analysis monitors utilization of the I/O subsystems, CPU, and processor buses.
  • Throttling analysis helps identify performance issues that occur when the CPU exceeds its thermal or power limits.
  • CPU/FPGA Interaction (preview) analysis explores FPGA utilization for each FPGA accelerator and identifies the most time-consuming FPGA computing tasks.
  • GPU Rendering (preview) analysis estimates the CPU/GPU utilization of your code running on the Xen virtualization platform.
  • Platform Profiler analysis collects data on a deployed system running a full load over an extended period of time with insights into overall system configuration, performance, and behavior. The collection is run from a command prompt outside of VTune Profiler, and results are viewed in a web browser.
A PREVIEW FEATURE may or may not appear in a future production release. It is available for your use in the hopes that you will provide feedback on its usefulness and help determine its future. Data collected with a preview feature is not guaranteed to be backward compatible with future releases. Please send your feedback to parallel.studio.support@intel.com.
As an alternative, advanced users may consider creating a custom analysis using the data collectors provided by the VTune Profiler, or combining a VTune Profiler collector with any other custom collector.
