User Guide

  • 2020.3
  • 07/10/2020
  • Public Content

Performance Analysis Workflow

When profiling desktop API graphics applications with Intel® Graphics Performance Analyzers (Intel® GPA), you can:
  • Collect and display hardware and software metrics data from your application in real time and conduct Microsoft* Direct3D pipeline experiments using System Analyzer or System Analyzer HUD. This can help you understand the high-level performance profile of your graphics application, determine whether your application is CPU-bound or GPU-bound, and quickly isolate graphics bottlenecks.
  • Create stream files for further analysis with Graphics Frame Analyzer (for DirectX 11 and Vulkan).
  • Create frame capture files, which contain all Microsoft DirectX* context used to render the selected 3D frame, as well as GPU metrics per draw call/region.
  • Collect real-time trace data during the application run for further analysis with Graphics Trace Analyzer.
  • Understand the performance of your application at the frame level, render target level, and draw call level with Graphics Frame Analyzer:
    • Experiment with individual events (that is, any call that potentially renders one or more pixels to the frame buffer or performs other GPU work) and various settings for the entire rendering pipeline.
    • Modify states and shader code to see whether it is possible to improve render time.
    • Determine whether texture bandwidth is a performance bottleneck.
    • Minimize overdraw by analyzing pixel history.
    • Conduct "what if" optimization experiments without recompiling or rebuilding your application.
  • Visualize the execution profile of the various tasks in your code over time using Graphics Trace Analyzer:
    • Explore GPU usage and analyze the software queue for each GPU engine at any moment in time.
    • Analyze graphics API calls (draw calls, buffer locks, resource updates, presents).
    • Correlate CPU and GPU activity and identify whether your application is GPU-bound or CPU-bound.
    • Identify GPU and CPU application frame rate and how it depends on vertical synchronization.
    • Explore the performance of your application per selected GPU metrics over time.
    • Analyze GPU usage per DMA packet on a software queue.
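The dependence of frame rate on vertical synchronization mentioned in the list above can be illustrated with a small calculation. This is a minimal sketch, assuming classic double-buffered vsync on a fixed-refresh display (no adaptive sync or triple buffering): a frame that misses a refresh waits for the next one, so the presented rate snaps to the refresh rate divided by a whole number. The function name `vsynced_fps` is hypothetical and is not part of Intel® GPA.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not part of Intel GPA): estimates the presented frame
// rate under classic double-buffered vsync on a fixed-refresh display.
// A frame that is not ready at a refresh is held until the next one, so each
// frame occupies a whole number of refresh intervals.
double vsynced_fps(double frame_time_ms, double refresh_hz) {
    assert(frame_time_ms > 0.0 && refresh_hz > 0.0);
    const double interval_ms = 1000.0 / refresh_hz;
    // Whole refresh intervals consumed by one frame (always at least 1).
    const double intervals = std::ceil(frame_time_ms / interval_ms);
    return refresh_hz / intervals;
}
```

Under these assumptions, on a 60 Hz display a 10 ms frame is presented at 60 frames per second, while a 17 ms frame, only slightly over one refresh interval, drops to 30. A sudden halving of frame rate in a trace is therefore often a vsync effect rather than a doubling of the actual workload.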
No code modifications or special libraries are needed to determine whether your game is CPU-bound or GPU-bound, or to figure out what is happening within a specific frame of your game.
Additionally, you can instrument your application with the Instrumentation and Tracing Technology (ITT) API to visualize the execution profile of the various tasks in your code over time in the Graphics Trace Analyzer: just add calls within your game code to designate logical tasks in your game. You can also use System Analyzer to profile non-DirectX* applications instrumented by the ITT API. In this case, the Metric Tree Control pane shows only CPU metrics, and only the Capture Trace button is enabled.
For detailed information regarding the ITT API, see the About Instrumentation and Tracing Technology (ITT) APIs topic.
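As an illustration of designating logical tasks, the following is a minimal sketch, not an excerpt from the Intel® GPA documentation. It wraps the ITT calls `__itt_domain_create`, `__itt_string_handle_create`, `__itt_task_begin`, and `__itt_task_end` in a small RAII class. The class `ScopedTask`, the function `update_frame`, the domain name "MyGame", and the `USE_ITT` macro are all hypothetical names introduced here. The real calls are compiled only when `USE_ITT` is defined and the ITT header and library are available; otherwise the class only tracks nesting depth, so the sketch still builds on its own.

```cpp
#include <cassert>

#ifdef USE_ITT              // hypothetical switch: define it when ITT is available
#include <ittnotify.h>
#endif

// Hypothetical RAII wrapper (not part of the ITT API): begins an ITT task on
// construction and ends it on destruction. Without USE_ITT it only tracks
// nesting depth, so the sketch compiles and runs without the ITT library.
class ScopedTask {
public:
    explicit ScopedTask(const char* name) {
#ifdef USE_ITT
        // The string handle is requested on every call for brevity; real code
        // would normally create each handle once and reuse it.
        __itt_task_begin(domain(), __itt_null, __itt_null,
                         __itt_string_handle_create(name));
#else
        (void)name;
#endif
        ++depth();
    }
    ~ScopedTask() {
        assert(depth() > 0);
#ifdef USE_ITT
        __itt_task_end(domain());
#endif
        --depth();
    }
    ScopedTask(const ScopedTask&) = delete;
    ScopedTask& operator=(const ScopedTask&) = delete;

    // Number of currently open tasks (single-threaded bookkeeping only).
    static int& depth() { static int d = 0; return d; }

private:
#ifdef USE_ITT
    static __itt_domain* domain() {
        static __itt_domain* d = __itt_domain_create("MyGame");  // hypothetical name
        return d;
    }
#endif
};

// Hypothetical game-loop step: each scope becomes one logical task.
// Returns the deepest nesting level reached, for demonstration purposes.
int update_frame() {
    ScopedTask frame("Frame");
    int max_depth = ScopedTask::depth();
    {
        ScopedTask physics("Physics");     // nested inside "Frame"
        max_depth = ScopedTask::depth();
    }
    {
        ScopedTask render("Render");       // second child task of "Frame"
    }
    return max_depth;
}
```

Tying task begin and end to scope entry and exit keeps the two calls paired even when a function returns early, which is the main reason for the wrapper.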
NOTE
Intel® GPA is not supported on 32-bit systems.
Graphics Monitor is the main hub tool for application analysis. You must have Graphics Monitor running on your target system to start an application for analysis.
  1. Launch the Graphics Monitor to start the application and configure analysis settings.
  2. Connect your Intel® GPA tools installed on the host system to the target system using the target system IP address. If you are analyzing an application locally, you can specify “This Machine” or “localhost” instead.
    NOTE
    When you connect to a target system for the first time, Graphics Monitor asks you to authorize the connection. Click Accept Once to allow the connection for the current session only, or Accept Always to add the device IP address to the list of authorized devices.
  3. Perform system analysis with System Analyzer or System Analyzer HUD.
  4. Capture stream, frame or trace files for further in-depth analysis:
    • Perform stream analysis to detect frames with potential performance bottlenecks.
    • Perform frame analysis to understand the performance impact of specific draw calls at different stages of the rendering pipeline.
    • Perform platform analysis to analyze your application performance with respect to CPU and GPU utilization.
      NOTE
      By default, tracing is disabled. To enable platform analysis, in the Trace tab of the Graphics Monitor Options screen, enable the Tracing toggle button.
  5. Change your game code and re-run Intel® GPA to verify that your changes achieve the expected performance improvements.
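The verification in step 5 can be made concrete by comparing mean frame times measured before and after a change. The following is a minimal sketch, assuming you have recorded per-frame times in milliseconds yourself; the helper names `mean_ms` and `improvement_percent` are hypothetical, and nothing here reads an Intel® GPA file format.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Hypothetical helper: arithmetic mean of per-frame times in milliseconds.
// The values are assumed to come from your own measurements.
double mean_ms(const std::vector<double>& frame_times_ms) {
    assert(!frame_times_ms.empty());
    return std::accumulate(frame_times_ms.begin(), frame_times_ms.end(), 0.0)
           / static_cast<double>(frame_times_ms.size());
}

// Hypothetical helper: percentage reduction in mean frame time.
// A positive result means the change made frames faster on average.
double improvement_percent(const std::vector<double>& before_ms,
                           const std::vector<double>& after_ms) {
    const double before = mean_ms(before_ms);
    const double after = mean_ms(after_ms);
    return (before - after) / before * 100.0;
}
```

Comparing frame times rather than frames per second keeps the result proportional to the work actually saved. Capture both runs under the same scene and settings so that the difference reflects only the code change.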
 
