Intel Media SDK Tutorial - Analysis using Intel GPA tool

Intel® Graphics Performance Analyzers (Intel® GPA) are very useful for identifying media pipeline inefficiencies and opportunities for optimizations. We will be using Intel GPA to analyze some of the Intel® Media SDK tutorial use cases. Please refer to each sample description for details.

Intel GPA can be downloaded for free via the following link: http://software.intel.com/en-us/vcsource/tools/intel-gpa

Throughout this tutorial we use Intel GPA 2012 release 5. Note that later revisions of Intel GPA may introduce new features, so the Intel GPA specific steps described here may differ from what you see in your version.

With Intel GPA you can capture in-depth traces of media workloads, including operations such as fixed function HW processing (denoted as MFX from now on) and execution unit processing (denoted as EU from now on). Intel GPA can also display real-time GPU load via the Media Performance dialog.

The Intel GPA suite is started by launching the "Intel GPA Monitor" application, which then appears as a task bar icon. Right-clicking the task bar icon brings up a menu with a few options. For the purpose of media workload analysis, we will focus only on the following GPA menu items:

  • Media Performance
    • By selecting this menu item the user can explore the GPU load in real time. The EU and MFX loads are displayed, including the HW encode/decode/VPP load contribution.
    • The GPU load benchmarks presented in the "Workload Analysis and Benchmarking" section of the tutorial were captured using the Media Performance dialog.


  • Profiles and Preferences
    • To enable capturing detailed GPU media component traces, Intel GPA must first be configured as follows:
      • Select the “Profiles” menu item, then click the "Tracing" tab. Enable the following check boxes:
        • Media Performance Data
        • Hardware Context Data
        • Capture Application Startup
      • Press the “OK” button to exit the "Profiles" dialog window
         
      • Select the “Preferences” menu item. De-select the following check boxes:
        • Auto-detect launched applications
        • Disable Tracing
      • Press “OK” to exit the "Preferences" dialog window 
         
  • Analyze Application
    • To capture detailed workload traces select "Analyze Application" from the Intel GPA task bar menu.
    • To capture a trace for a specific workload, enter the executable path, name and arguments in the “Command Line” dialog item. Make sure that “Working Folder” is set correctly.
    • To start capturing traces for the specified workload, press the “Run” button.
    • When finished, the trace file can be found in your “Documents” folder in the GPA workloads subdirectory. For instance: "C:\Users\<user name>\Documents\GPA_2012_R5\"
    • To open the trace just double-click on the generated trace file.
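After a capture, the newest entry in the workloads folder is normally the trace you just recorded. The following is a minimal sketch for locating it, assuming the folder layout from the example path above. The helper name `newest_file` is hypothetical, and because the tutorial does not state the trace file extension, the sketch does not filter by extension.

```python
from pathlib import Path
from typing import Optional


def newest_file(folder: Path) -> Optional[Path]:
    """Return the most recently modified regular file in `folder`, or None if it has no files."""
    files = [p for p in folder.iterdir() if p.is_file()]
    if not files:
        return None
    return max(files, key=lambda p: p.stat().st_mtime)


if __name__ == "__main__":
    # Assumed location, based on the example path in the text.
    # Adjust the last component to match your Intel GPA version.
    trace_dir = Path.home() / "Documents" / "GPA_2012_R5"
    if trace_dir.is_dir():
        print(newest_file(trace_dir))
    else:
        print(f"No trace folder found at {trace_dir}")
```

Sorting by modification time avoids guessing at the naming scheme Intel GPA uses for generated traces.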
       

To enable extended Intel GPA tracing granularity, we recommend adding the following item to the Windows registry using "regedit":

  • Create the “MediaTraceLevel” entry (REG_DWORD) in “HKEY_CURRENT_USER\Software\Intel\GPA\<GPA version>\”
  • Set the value of the new item to "9"
  • Restart Intel GPA after applying the registry changes.
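As an alternative to editing the value by hand, the same change can be expressed as a .reg file and imported with "regedit". The sketch below only generates the file text; it does not touch the registry. The function name `media_trace_level_reg` is hypothetical, and the GPA version key is left as a caller-supplied parameter because the tutorial does not spell out its exact name. Check the existing key under "HKEY_CURRENT_USER\Software\Intel\GPA\" on your system before importing anything.

```python
def media_trace_level_reg(gpa_version: str, level: int = 9) -> str:
    """Build .reg file text that sets the MediaTraceLevel REG_DWORD under the GPA key.

    `gpa_version` is the name of the version sub-key; it is an input because
    the exact name depends on the installed Intel GPA release.
    """
    if not gpa_version or "\\" in gpa_version:
        raise ValueError("gpa_version must be a single, non-empty key name")
    if not 0 <= level <= 0xFFFFFFFF:
        raise ValueError("level must fit in a 32-bit REG_DWORD")
    key = "\\".join(["HKEY_CURRENT_USER", "Software", "Intel", "GPA", gpa_version])
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key}]\n"
        # DWORD values in .reg files are written as 8 hexadecimal digits.
        f'"MediaTraceLevel"=dword:{level:08x}\n'
    )


if __name__ == "__main__":
    # The placeholder below must be replaced with the real version key name
    # before the output is saved and imported.
    print(media_trace_level_reg("<GPA version>"))
```

If you save the output to a file, writing it with `open(path, "w", encoding="utf-16", newline="\r\n")` should produce a Unicode file with Windows line endings that "regedit" can import. As with the manual steps, restart Intel GPA after importing.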


Before analyzing any of the Intel Media SDK tutorial workloads, let’s examine what information is presented in traces captured by Intel GPA. Captured events are categorized into tracks, separated based on the type of context and location of execution. The table below lists the tracks of relevance for media workload analysis:

Track Name | Track Description
--- | ---
GPU MFX Queue | Represents the GPU fixed function HW (MFX) unit queue. Gaps in this track indicate GPU idle time, which is something we would like to avoid for best throughput/performance.
GPU EU Queue | Represents the GPU execution unit (EU) queue. Gaps in this track indicate GPU idle time, which is something we would like to avoid for best throughput/performance. Note that for some workloads the EU is not used since all of the processing occurs on the MFX unit.
MSDK app | Represents the Intel Media SDK API calls made from the application.
GPU DECODE | Represents the GPU performing fixed function decode operation in the MFX unit.
Application track (such as “simple_decode.exe”) | Represents low level functions called from the Intel Media SDK DLL layer. Threads are not affinitized so the calls may migrate between the Intel Media SDK layer threads (one thread per logical processor core).
04 GPU ENCODE | Represents all GPU EU encoder operations (for instance, motion estimation).
06 GPU ENCODE | Represents all of the GPU MFX encoding operations.
GPU VPP | Represents the GPU performing VPP operation on the GPU EU.
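
To make the notion of "gaps" in the GPU MFX Queue and GPU EU Queue tracks concrete, the sketch below computes idle intervals and utilization from a list of busy intervals. It is an illustration only: the interval values are made up, the function names are hypothetical, and no Intel GPA export format or API is assumed. In practice the gaps are read visually from the trace view.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start, end) in any consistent time unit


def idle_gaps(busy: List[Interval], start: float, end: float) -> List[Interval]:
    """Return the parts of [start, end] that are not covered by any busy interval."""
    gaps: List[Interval] = []
    cursor = start  # everything before the cursor is already accounted for
    for b_start, b_end in sorted(busy):
        if b_end <= cursor:
            continue  # interval lies entirely in time already covered
        if b_start >= end:
            break  # interval starts after the window of interest
        if b_start > cursor:
            gaps.append((cursor, b_start))
        cursor = max(cursor, b_end)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


def utilization(busy: List[Interval], start: float, end: float) -> float:
    """Fraction of [start, end] during which the queue was busy."""
    idle = sum(g_end - g_start for g_start, g_end in idle_gaps(busy, start, end))
    return 1.0 - idle / (end - start)


if __name__ == "__main__":
    # Made-up busy intervals for one queue track, in milliseconds.
    mfx_busy = [(0, 4), (6, 10), (9, 12)]
    print(idle_gaps(mfx_busy, 0, 16))             # [(4, 6), (12, 16)]
    print(f"{utilization(mfx_busy, 0, 16):.1%}")  # 62.5%
```

A low utilization figure for a queue corresponds to many or long gaps in that track, which is the pattern to look for when searching for pipeline inefficiencies.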

 

Note: Do not keep the Media Performance Analyzer window open while collecting traces, because it prevents Intel GPA from capturing some GPU tracing details.

For another view on how to use Intel GPA to analyze Intel Media SDK workloads, please refer to the following white paper: http://software.intel.com/en-us/articles/using-intel-graphics-performance-analyzer-gpa-to-analyze-intel-media-software-development
