This Intel® Media SDK tutorial sample operates in exactly the same way as the previous tutorial sample, "simple_4_vpp_resize_procamp", except that it uses D3D memory surfaces.
Like the "simple_2_decode" tutorial sample, this sample supports both Microsoft DirectX* 9 and DirectX* 11. For more details on this topic, please refer to the "simple_2_decode" sample description.
The Intel® GPA “Media Performance” dialog now shows a better performance snapshot, with an overall GPU utilization of ~40%. For tutorial snapshot benchmarks comparing all workloads analyzed with Intel GPA, navigate to this page.
To explore the reasons for the improved GPU utilization, let’s study the Intel GPA trace.
- As with the previous workload, the RunFrameVPPAsync() call leads to the VPP Submit operation on the “simple_vpp_d3d.exe” track. However, in this case the task is submitted to the GPU, via DXVA2_Execute, almost immediately since the input surface already resides on a D3D memory surface. This improved workload behavior is the sole reason for the improved performance and GPU utilization. After the DXVA2_Execute call, the GPU starts processing the frame as seen in the “GPU VPP” track.
- Like the encode workloads discussed earlier (such as “simple_3_encode_d3d”), Intel Media SDK uses a polling mechanism, via VPP Query, to determine whether the GPU has fully processed the frame. The GPU is queried every 1 ms until the frame is ready. Since the current workload is synchronous, the SyncOperation() call waits until the next VPP Query, which results in quite a long segment of inactivity (see “A” in the trace above).
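The cost of the polling interval described above can be illustrated with a small timing model. This is a sketch under stated assumptions, not Intel Media SDK code: the function names are hypothetical, GPU processing is assumed to start at submission, and completion is assumed to be visible only at polling ticks spaced one interval apart, counted from submission.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model, not part of the Intel Media SDK API.
// Returns the time (ms after submission) at which a synchronous caller
// learns that a frame is done, assuming completion is only observed at
// polling ticks spaced poll_interval_ms apart.
double observed_completion_ms(double gpu_time_ms, double poll_interval_ms) {
    assert(gpu_time_ms > 0.0 && poll_interval_ms > 0.0);
    // Round the GPU time up to the next polling tick.
    return std::ceil(gpu_time_ms / poll_interval_ms) * poll_interval_ms;
}

// Time (ms) between the GPU finishing the frame and the caller noticing it.
double idle_after_gpu_done_ms(double gpu_time_ms, double poll_interval_ms) {
    return observed_completion_ms(gpu_time_ms, poll_interval_ms) - gpu_time_ms;
}
```

With an illustrative GPU time of 0.25 ms (an assumed value, not a measurement) and the 1 ms interval, the model reports completion at 1 ms, so 0.75 ms of each frame period is spent waiting for the next query.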
We already demonstrated how to achieve greater performance by making the Intel Media SDK work in an asynchronous fashion, as in the “simple_3_encode_d3d_async” workload. The same approach can be used for VPP processing so we will not explore this case further. It suffices to say that GPU utilization and performance can be improved significantly by applying the task concurrency approach for VPP.
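To make the effect of task concurrency concrete, the following is a minimal timing simulation rather than the sample's actual implementation. It assumes a serial GPU, zero submission cost, and completion observed only at polling ticks counted from each task's submission. The names `simulate`, `PipelineResult` and `depth` (the number of tasks kept in flight) are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>

// Hypothetical timing model, not Intel Media SDK code. Times are in microseconds.
struct PipelineResult {
    long long total_us;     // host time once the last frame has been synced
    long long gpu_busy_us;  // accumulated GPU processing time
};

// Round a task's completion up to the next polling tick after its submission.
long long observed_done_us(long long submit_us, long long gpu_end_us, long long poll_us) {
    long long elapsed = gpu_end_us - submit_us;
    long long ticks = (elapsed + poll_us - 1) / poll_us;  // ceiling division
    return submit_us + ticks * poll_us;
}

// depth == 1 models the synchronous case; depth > 1 keeps several tasks in flight.
PipelineResult simulate(int frames, int depth, long long gpu_us, long long poll_us) {
    assert(frames > 0 && depth > 0 && gpu_us > 0 && poll_us > 0);
    std::deque<long long> in_flight;  // observed completion time per pending task
    long long host = 0;               // current host time
    long long gpu_free = 0;           // time at which the GPU queue drains
    for (int i = 0; i < frames; ++i) {
        // When all slots are taken, block on the oldest task (the "sync" step).
        if (static_cast<int>(in_flight.size()) == depth) {
            host = std::max(host, in_flight.front());
            in_flight.pop_front();
        }
        // Submit the next frame; the GPU runs it once earlier work is finished.
        long long start = std::max(host, gpu_free);
        gpu_free = start + gpu_us;
        in_flight.push_back(observed_done_us(host, gpu_free, poll_us));
    }
    // Drain the remaining tasks in submission order.
    while (!in_flight.empty()) {
        host = std::max(host, in_flight.front());
        in_flight.pop_front();
    }
    return {host, frames * gpu_us};
}
```

With illustrative inputs of 10 frames, 300 µs of GPU work per frame and a 1000 µs polling interval, the model finishes in 10000 µs at depth 1 and in 3000 µs at depth 4. These are outputs of the model under the assumptions above, not measurements of the tutorial workload.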
That concludes the analysis of Intel Media SDK decode, encode and VPP workloads. The next tutorial sections explore the behavior of workloads combining several Intel Media SDK components.
This tutorial sample is found in the tutorial samples package under the name "simple_4_vpp_resize_procamp_d3d". The code is extensively documented with inline comments detailing each step required to set up and execute the use case.