This video shows how to perform a trace analysis for a CPU-bound application with the Platform Analyzer in Intel® Graphics Performance Analyzers (Intel® GPA). You can analyze overall GPU use per GPU engine at each moment of time, analyze DMA packet submission on a software queue, correlate CPU and GPU activity per thread, and more.
This video will show you how to get started with trace analysis of your DirectX* application using Trace Analyzer.
To start the analysis, launch Trace Analyzer on your system and select the captured trace file.
Once the trace file has loaded, the timeline is populated with a large amount of data about the platform.
Trace Analyzer shows packets of work on the CPU and GPU. To start, explore the different sections of Trace Analyzer.
These are the CPU Context Queue, Captured Metrics, GPU Adapter Queue, Threads and API Calls, and the CPU Submission Queue.
Use the hotkeys W, A, S, and D to zoom in, move left, zoom out, and move right along the timeline, respectively.
By comparing the GPU Adapter Queue with the CPU Context Queue, we can determine whether an application is CPU bound or GPU bound.
If there are gaps between packets on the GPU engine while your CPU is at maximum utilization, you are most likely CPU bound. On the other hand, if the GPU is fully utilized while the CPU sits idle, the GPU is the bottleneck.
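The rule of thumb above can be written down as a small Python sketch. This is not part of Trace Analyzer and does not use any Intel GPA API. The packet intervals, the function names `busy_fraction` and `classify`, and the 0.9 and 0.7 thresholds are all assumptions chosen for illustration. The idea is simply to measure how much of a time window is covered by work packets, then compare the GPU and CPU figures.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_ms, end_ms) of one work packet

def busy_fraction(intervals: List[Interval], window: Interval) -> float:
    """Return the fraction of the window covered by at least one packet."""
    w_start, w_end = window
    if w_end <= w_start:
        raise ValueError("window must have positive length")
    covered = 0.0
    cursor = w_start  # everything before the cursor is already counted
    for start, end in sorted(intervals):
        start = max(start, cursor)  # skip time already counted or before the window
        end = min(end, w_end)       # clip to the end of the window
        if end > start:
            covered += end - start
            cursor = end
    return covered / (w_end - w_start)

def classify(gpu_busy: float, cpu_busy: float,
             high: float = 0.9, low: float = 0.7) -> str:
    """Apply the heuristic: a busy CPU with GPU gaps suggests CPU bound, and vice versa.

    The thresholds are illustrative assumptions, not values from Intel GPA.
    """
    if cpu_busy >= high and gpu_busy < low:
        return "likely CPU bound"
    if gpu_busy >= high and cpu_busy < low:
        return "likely GPU bound"
    return "inconclusive"

# Hypothetical data: GPU packets with gaps between them, CPU busy the whole time.
gpu_packets = [(0.0, 4.0), (6.0, 9.0), (12.0, 14.0)]
cpu_packets = [(0.0, 16.0)]
window = (0.0, 16.0)

gpu_busy = busy_fraction(gpu_packets, window)
cpu_busy = busy_fraction(cpu_packets, window)
print(f"GPU busy: {gpu_busy:.2%}, CPU busy: {cpu_busy:.2%}")
print(classify(gpu_busy, cpu_busy))
```

In practice you read this off the timeline visually. The sketch only makes the comparison explicit, and the "inconclusive" case is a reminder that many applications are not purely bound by one side.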
If your application is purely CPU bound and a deeper dive is required, consider using Intel VTune Amplifier XE to help find the source of slowdowns in your code.
This video covers only a small portion of what Trace Analyzer can do.
Press F1 anywhere within Trace Analyzer to read more about the available features and the documentation.