In this Intel® Media SDK tutorial transcode sample we introduce asynchronous pipeline behavior using the same approach as we did in the “simple_3_encode_d3d_async” sample.
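The asynchronous idea can be sketched without any Media SDK calls. The snippet below is only an illustration under stated assumptions: `TranscodeFrame` and `RunAsyncPipeline` are hypothetical names, the "work" is a trivial CPU computation standing in for GPU decode and encode, and `std::async` stands in for the SDK's own task submission and synchronization. It is not the code from the tutorial sample. It shows the general pattern of keeping a bounded pool of tasks in flight and synchronizing only on the oldest task when the pool is full.

```cpp
#include <cstddef>
#include <deque>
#include <future>
#include <vector>

// Hypothetical stand-in for one decode+encode step. A real pipeline would
// submit work to the GPU here; this sketch just computes a value.
int TranscodeFrame(int frameIndex) {
    return frameIndex * 2;
}

// Processes frameCount frames while keeping at most poolSize tasks in flight.
// Results are collected in submission order.
std::vector<int> RunAsyncPipeline(int frameCount, std::size_t poolSize) {
    if (poolSize == 0) {
        poolSize = 1;  // at least one task slot is needed to make progress
    }

    std::deque<std::future<int>> inFlight;  // pending tasks, oldest first
    std::vector<int> output;

    for (int frame = 0; frame < frameCount; ++frame) {
        // Pool is full: synchronize on the oldest task only, which frees
        // one slot while the remaining tasks keep running.
        if (inFlight.size() == poolSize) {
            output.push_back(inFlight.front().get());
            inFlight.pop_front();
        }
        // Submit the next frame without waiting for it to finish.
        inFlight.push_back(
            std::async(std::launch::async, TranscodeFrame, frame));
    }

    // Drain the tasks that are still in flight once all frames are submitted.
    while (!inFlight.empty()) {
        output.push_back(inFlight.front().get());
        inFlight.pop_front();
    }
    return output;
}
```

Calling `RunAsyncPipeline(5, 2)` in this sketch processes five frames with at most two tasks pending at any time. A larger pool lets more work queue up ahead of the consumer, at the cost of more buffers held in flight.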
Using the Intel® GPA “Media Performance” dialog, we observe that overall GPU utilization has improved significantly and that the GPU is now almost completely utilized (~97%). For tutorial snapshot benchmarks comparing all workloads analyzed with Intel GPA, navigate to this page.
Let’s look at the Intel GPA workload trace to explore the new transcode pipeline behavior:
- First, we can confirm that the GPU is much better utilized, noting the lack of gaps in the “GPU MFX Queue” and “GPU EU Queue” tracks.
- Utilization and performance also benefit from the concurrent execution of tasks in the encoder EU (“04 GPU ENCODE”) and MFX (“06 GPU ENCODE”) tracks.
- Because the GPU is highly utilized, it consistently resides in a high-frequency state (due to Intel® Turbo Boost Technology¹), which further improves overall performance.
This concludes the Intel GPA performance analysis of the Intel Media SDK tutorial workloads. In the remaining tutorial samples we will only explain the overall purpose and behavior, and will not include performance analysis using Intel GPA.
This tutorial sample is found in the tutorial samples package under the name "simple_5_transcode_opaque_async". The code is extensively documented with inline comments detailing each step required to set up and execute the use case.
¹ Requires a system with Intel® Turbo Boost Technology. Intel Turbo Boost Technology and Intel Turbo Boost Technology 2.0 are only available on select Intel® processors. Consult your PC manufacturer. Performance varies depending on hardware, software, and system configuration. For more information, visit http://www.intel.com/go/turbo