Intel® VTune™ Profiler User Guide

ID 766319
Date 12/16/2022
Public

What's New in Intel® VTune™ Profiler

Intel® VTune™ Profiler 2023.0

Download this version of Intel® VTune™ Profiler from the product download page. This version contains the following additions:

  • GPU Accelerators
    • Stall Factor Information in GPU Profiling Results

      When you run the GPU Compute/Media Hotspots analysis to profile applications running on Intel® Data Center GPU Max Series (code named Ponte Vecchio) devices, you can now see the reasons for stalls in Xe Vector Engines (XVEs), formerly known as Execution Units (EUs). Use this information to better understand and resolve the stalls in your busiest computing tasks. For more information, see Analyze Xe Vector Engine (XVE) Stalls.

    • Metric Groups for Multiple GPUs

      When you run the GPU Compute/Media Hotspots analysis to profile an application executing on multiple Intel GPUs, you can now see metric information grouped by Intel microarchitecture family. See metrics for every GPU architecture family in a new consolidated view. To learn more, see Analysis Results for Multiple GPUs.

    • Updated Metrics for Multiple GPUs

      GPU metric information in the Summary tab of the HPC Performance Characterization view has been enhanced to better represent data collected from multiple GPUs.

    • Support for Unified Shared Memory extension of OpenCL™ API

      When you use the GPU Offload analysis type to profile OpenCL™ applications, you can now profile the CPU-side stacks for GPU computing tasks and identify bottlenecks related to Unified Shared Memory (USM) for the OpenCL™ API.

    • Support for DirectML API

      This release also extends profiling support in the GPU Offload and GPU Compute/Media Hotspots analysis types for Microsoft® DirectX* applications to include support for the DirectML API.
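
As a rough illustration of how the GPU analyses described above might be started from a script, the sketch below assembles `vtune -collect` command lines without running them. It assumes that `gpu-hotspots` and `gpu-offload` are the command-line names of the GPU Compute/Media Hotspots and GPU Offload analysis types (confirm with `vtune -help collect` on your installation). The helper `build_vtune_command`, the application path, and the result directory names are hypothetical.

```python
import shlex


def build_vtune_command(analysis, app_argv, result_dir=None):
    """Return an argv list for a `vtune -collect` run (nothing is executed)."""
    if not app_argv:
        raise ValueError("app_argv must name the application to profile")
    cmd = ["vtune", "-collect", analysis]
    if result_dir is not None:
        cmd += ["-result-dir", result_dir]
    # Everything after "--" is the target application and its arguments.
    cmd += ["--", *app_argv]
    return cmd


# Hypothetical application name and result directories, for illustration only.
hotspots_cmd = build_vtune_command("gpu-hotspots", ["./my_gpu_app"], "r_gpu_hotspots")
offload_cmd = build_vtune_command("gpu-offload", ["./my_gpu_app"], "r_gpu_offload")

print(shlex.join(hotspots_cmd))
print(shlex.join(offload_cmd))
```

Returning a list rather than a single string keeps each argument separate, so the result can be passed to `subprocess.run` without shell quoting concerns.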

  • Application Performance Snapshot
    • Updated Metrics for Multiple GPUs

      GPU metric information in the Application Performance Snapshot HTML reports has been enhanced to better represent data collected from multiple GPUs.

    • Histograms in Metric Tooltips

      The metric tooltips in Application Performance Snapshot HTML reports were enhanced with histograms that clearly visualize the distribution of metric values observed during analysis.

  • High Performance Computing
    • Better Hardware Observability

      This release adds the Platform Diagram to the Summary tab of the HPC Performance Characterization analysis result. The Platform Diagram shows the system topology along with utilization metrics for physical cores, DRAM, and Intel® Ultra Path Interconnect (Intel® UPI) links.

      The Platform Diagram is available for server platforms based on Intel® microarchitecture code named Skylake and newer architectures.

  • Input and Output Analysis
    • Intel® VT-d Observability

      Intel® Virtualization Technology for Directed I/O (Intel® VT-d) observability is introduced in the Input and Output analysis for server platforms based on 3rd Gen Intel® Xeon® Scalable processors (code named Ice Lake), the Intel Atom® P5900 Processor Family (code named Snow Ridge), and newer. New performance metrics reveal the efficiency of hardware-driven DMA address remapping and the penalties for sub-optimal Intel VT-d utilization.

  • VTune Profiler Server
    • New Command-Line Options for Convenience

      The vtune-backend binary that launches VTune Profiler Server now has new command-line options to make setup in certain environments more convenient. You can now specify a base URL that VTune Profiler Server will use as the basis for URL generation. Additionally, new options were added to suppress automatic help tours on startup and to provide/decline consent to collect usage information right from the command line.

      These new options can be especially useful if you are running VTune Profiler Server inside a container.
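
To illustrate what using a base URL "as the basis for URL generation" can mean in practice, for example behind a reverse proxy or inside a container, here is a minimal conceptual sketch. It is not VTune Profiler Server code and does not show the actual option names (consult the vtune-backend help output for those). The function `resolve_under_base` and all URLs are hypothetical.

```python
from urllib.parse import urljoin


def resolve_under_base(base_url, relative_path):
    """Join a server-relative path onto a configured base URL."""
    # Ensure the base ends with "/" so urljoin keeps its last path segment.
    base = base_url if base_url.endswith("/") else base_url + "/"
    # Drop leading slashes so the path is resolved under the base,
    # not against the host root.
    return urljoin(base, relative_path.lstrip("/"))


# Hypothetical URLs: one served behind a proxy prefix, one served directly.
proxied = resolve_under_base("https://proxy.example.com/vtune", "/results/r001")
direct = resolve_under_base("https://localhost:8080/", "results/r001")

print(proxied)
print(direct)
```

With a proxy prefix such as `/vtune`, generated links stay under that prefix instead of pointing at the host root, which is the situation a configurable base URL is meant to handle.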

  • More Information on Windows*
    • Support for Debug Information for Inline Functions

      VTune Profiler can now read debugging information for inline functions from PDB symbol files on Windows* OS and display the names and source code of inline functions in your workload.

  • Managed Code Targets
    • .NET 6 Support

      This release introduces support for analyzing .NET 6 targets using User-Mode Sampling. You can analyze .NET 6 workloads in Launch Application and Attach to Process modes on both Windows* and Linux* hosts.

  • Language Support
    • Support for New Language Versions

      This release introduces support for Python 3.9.0 in the Hotspots analysis type for Windows and Linux systems.

  • Platform Support
    • Support for Legacy Processors

      VTune Profiler now supports the following generations of processors in client and server platforms:

      • Server CPUs: Intel® Xeon® processor v3 and newer families.
      • Client CPUs: Intel® Core™ 4th generation processors and newer families.

      The 2023 version of VTune Profiler does not support processors older than the versions listed above. To analyze performance on older processors, use an older version of VTune Profiler.

  • Hardware Support
    • Support for New Architectures

      This release of Intel® VTune™ Profiler supports the following new Intel architectures and device families:

      • Fourth generation of Intel® Xeon® Scalable Processor (code named Sapphire Rapids)
      • 13th generation of Intel® Core™ Processor (code named Raptor Lake)
      • Intel® Data Center GPU Max Series (code named Ponte Vecchio)
      • First generation of Intel® Arc™ High-performance Discrete GPUs (code named Alchemist). This support includes:
        • Explicit support for SYCL, DirectX, Intel® Media SDK, OpenCL™, and OpenMP offload software technologies.
        • Support for multi-GPU systems. You can now profile all Intel GPU devices, including integrated and discrete GPUs.
        • Support for GPU Offload and GPU Hotspots analyses, including source level in-kernel profiling.
    NOTE:
    Intel® Xe graphics product families, starting with Intel® Arc™ Alchemist (formerly DG2), use GPU architecture terminology that differs from legacy terms. For details on the terminology changes and how they map to legacy content, see GPU Architecture Terminology for Intel® Xe Graphics.
  • Operating System Support
    • New Host Operating Systems

      This release introduces support for these OS hosts:

      • Microsoft Windows* 11
      • Ubuntu* 21.10