HOW: Analysis Types

Intel® VTune™ Amplifier provides a set of pre-configured analysis types you may start with to address your particular performance optimization goals.

When you create a project, the VTune Amplifier opens the Configure Analysis window, which prompts you to specify WHAT you want to analyze (an application, a process, or a whole system), WHERE you plan to run the analysis, and HOW you need to run it.

Configure Analysis: Analysis Type

Clicking the Browse button in the HOW pane opens an analysis guide that helps you choose an analysis type applicable to your workload. Analysis types are organized into groups:

Hotspots group:

  • Hotspots is best for analyzing call paths to find where your code is spending the most time and discover opportunities for tuning your algorithms. Applies to C/C++, Fortran, Java*, or Python* apps and more, including apps running in containers such as Docker*, LXC, and others. See Finding Hotspots tutorial: Linux | Windows.

  • Memory Consumption is best for analyzing memory consumption by your app, its distinct memory objects, and their allocation stacks. Applies to C/C++ or Python apps. This analysis is supported for Linux targets only.

Parallelism group:

  • Threading is best for visualizing thread parallelism on available cores, locating causes of low concurrency, and identifying serial bottlenecks in your code. Applies to C/C++, Fortran, or Java* apps and more.

  • HPC Performance Characterization is best for understanding how your compute-intensive OpenMP* or MPI app is using the CPU, memory, and floating point unit (FPU) resources. Applies to C/C++ or Fortran apps and more. See Analyzing an OpenMP* and MPI Application tutorial: Linux.

Microarchitecture group:

  • Microarchitecture Exploration (formerly known as General Exploration) is best for identifying the CPU pipeline stage (front-end, back-end, etc.) and hardware units responsible for your hardware bottlenecks. Applies to C/C++, Fortran, or Java* apps and more, including apps in containers such as Docker* or LXC.

  • Memory Access is best for memory-bound apps to determine which level of the memory hierarchy is impacting your performance by reviewing CPU cache and main memory usage, including possible NUMA issues. Applies to C/C++, Fortran, or Java* apps and more.
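The analysis types in these groups can also be launched from a command line rather than from the Configure Analysis window. The following Python sketch builds such a command for the `amplxe-cl` tool. The short analysis names in the table are assumptions based on the VTune Amplifier 2019 command-line naming and should be confirmed with `amplxe-cl -help collect` on your installation; the target `./my_app`, its arguments, and the result directory name are hypothetical placeholders.

```python
import shlex

# Assumed command-line names for the analysis types listed above.
# Verify them with `amplxe-cl -help collect` before relying on this table.
ANALYSIS_TYPES = {
    "Hotspots": "hotspots",
    "Memory Consumption": "memory-consumption",
    "Threading": "threading",
    "HPC Performance Characterization": "hpc-performance",
    "Microarchitecture Exploration": "uarch-exploration",
    "Memory Access": "memory-access",
}


def build_collect_command(analysis, target, result_dir=None):
    """Return the amplxe-cl argument list for one analysis run.

    analysis   -- analysis type name as shown in the analysis guide
    target     -- list holding the application path and its arguments
    result_dir -- optional directory for the collected result
    """
    if analysis not in ANALYSIS_TYPES:
        raise ValueError(f"unknown analysis type: {analysis!r}")
    command = ["amplxe-cl", "-collect", ANALYSIS_TYPES[analysis]]
    if result_dir is not None:
        command += ["-result-dir", result_dir]
    # "--" separates collector options from the target and its arguments.
    command += ["--", *target]
    return command


if __name__ == "__main__":
    # ./my_app and its arguments are hypothetical placeholders.
    cmd = build_collect_command(
        "Hotspots", ["./my_app", "--size", "1000"], result_dir="r000hs"
    )
    # Print a shell-safe command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning an argument list instead of a single string keeps the target's arguments intact when the list is passed to `subprocess.run`, and the script only prints the command, so it can be tried on a machine where VTune Amplifier is not installed.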

In addition, the VTune Amplifier offers the Platform analysis group, which is helpful in specific use cases such as GPU, I/O, and IRQ analysis:

  • CPU/GPU Concurrency enables you to explore code execution on the various CPU and GPU cores on your platform, correlate CPU and GPU activity, and identify whether your application is GPU-bound or CPU-bound.

  • GPU Compute/Media Hotspots targets applications that use a Graphics Processing Unit (GPU) for rendering, video processing, and computations. If you ran the CPU/GPU Concurrency analysis and identified that your application is GPU-bound, use the GPU Compute/Media Hotspots analysis to go deeper: identify the most time-consuming GPU computing tasks and analyze performance per GPU hardware metrics.

  • GPU In-Kernel Profiling targets GPU-bound applications. It helps you analyze GPU kernel execution per code line and identify performance issues caused by memory latency or inefficient kernel algorithms. This analysis type incurs a higher performance overhead.

  • System Overview is a driverless event-based sampling analysis that monitors the general behavior of your target Linux* or Android* system and correlates power and performance metrics with interrupt request (IRQ) handling.

  • Input and Output analysis monitors utilization of the I/O subsystems, CPU, and processor buses.

  • CPU/FPGA Interaction (preview) analysis explores FPGA utilization for each FPGA accelerator and identifies the most time-consuming FPGA computing tasks.

  • GPU Rendering (preview) analysis estimates the CPU/GPU utilization of your code running on the Xen virtualization platform.

  • Platform Profiler (preview) analysis collects data on a deployed system running a full load over an extended period of time and provides insights into overall system configuration, performance, and behavior. The collection is run from a command prompt outside of VTune Amplifier, and results are viewed in a web browser.

Note

A PREVIEW FEATURE may or may not appear in a future production release. It is available for your use in the hopes that you will provide feedback on its usefulness and help determine its future. Data collected with a preview feature is not guaranteed to be backward compatible with future releases. Please send your feedback to parallel.studio.support@intel.com.

As an alternative, advanced users may consider creating a custom analysis using the data collectors provided by the VTune Amplifier, or combining a VTune Amplifier collector with any other custom collector.

For more complete information about compiler optimizations, see our Optimization Notice.