In the Intel® Media SDK tutorial sections covering encode, decode, and VPP we addressed the system memory performance issue by utilizing D3D surfaces. We could take the same approach for the transcode workload, but for greater flexibility Intel Media SDK offers an alternative for transcode workloads: “opaque memory” surfaces. With opaque surfaces the SDK selects the underlying surface type internally, using D3D surfaces when the HW-accelerated codec is in use and system memory surfaces when the SW codec is in use. This transcode sample therefore uses opaque memory surfaces.
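As a rough illustration, opaque memory is requested by setting the opaque `IOPattern` flags and sharing a `mfxExtOpaqueSurfaceAlloc` extended buffer between the decoder output and encoder input. The sketch below is not the full sample code; it assumes an initialized `session` and omits error handling, and the variable names are placeholders:

```cpp
// Request opaque memory on the decoder output and encoder input
mfxVideoParam decParams = {};
decParams.IOPattern = MFX_IOPATTERN_OUT_OPAQUE_MEMORY;
mfxVideoParam encParams = {};
encParams.IOPattern = MFX_IOPATTERN_IN_OPAQUE_MEMORY;
// ... fill in the remaining codec parameters as in the sample ...

// Query how many surfaces each component needs
mfxFrameAllocRequest decRequest = {}, encRequest = {};
MFXVideoDECODE_QueryIOSurf(session, &decParams, &decRequest);
MFXVideoENCODE_QueryIOSurf(session, &encParams, &encRequest);

// Allocate only surface headers; the SDK owns the actual pixel buffers
mfxU16 nSurf = decRequest.NumFrameSuggested + encRequest.NumFrameSuggested;
std::vector<mfxFrameSurface1>  surfaces(nSurf);
std::vector<mfxFrameSurface1*> surfacePtrs(nSurf);
for (mfxU16 i = 0; i < nSurf; ++i) {
    surfaces[i] = {};
    surfaces[i].Info = decParams.mfx.FrameInfo;
    surfacePtrs[i] = &surfaces[i];
}

// Share one opaque allocation: decoder output is encoder input
mfxExtOpaqueSurfaceAlloc opaqueAlloc = {};
opaqueAlloc.Header.BufferId = MFX_EXTBUFF_OPAQUE_SURFACE_ALLOCATION;
opaqueAlloc.Header.BufferSz = sizeof(opaqueAlloc);
opaqueAlloc.Out.Surfaces   = surfacePtrs.data();
opaqueAlloc.Out.NumSurface = nSurf;
opaqueAlloc.Out.Type       = decRequest.Type;
opaqueAlloc.In             = opaqueAlloc.Out;

mfxExtBuffer* extBuf[] = { &opaqueAlloc.Header };
decParams.ExtParam = extBuf;  decParams.NumExtParam = 1;
encParams.ExtParam = extBuf;  encParams.NumExtParam = 1;
```

Because only surface headers are allocated by the application, the SDK is free to back them with D3D or system memory as the selected implementation requires.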
Via the Intel® GPA “Media Performance” dialog we observe that overall GPU utilization has improved slightly, to ~85%, versus the previous sample, "simple_5_transcode", which used system memory surfaces. For tutorial snapshot benchmarks comparing all workloads analyzed with Intel GPA, navigate to this page.
Let’s analyze the Intel GPA trace for this workload to explore what is going on:
- Compared to the previous workload trace that used system memory surfaces, the only difference is the elimination of two surface copies per frame. As you would expect, this leads to lower CPU utilization. The GPU is also more active overall, leading to a higher average GPU frequency and improved workload performance.
- One thing to note is the overlap of the decode operation in the “GPU DECODE” track with the motion estimation encode operation in the “04 GPU ENCODE” track. It may seem that the encode task must wait for decode to complete, but in this case the encoder and decoder are actually working on different frames, so the two operations can proceed in parallel.
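The overlap observed in the trace follows from how the asynchronous Media SDK calls are chained. A simplified sketch of the transcode inner loop is shown below; `inBS`, `outBS`, and `freeSurface` are placeholder names for the input bitstream, output bitstream, and a free working surface, not identifiers from the sample:

```cpp
mfxSyncPoint     syncpD = NULL, syncpE = NULL;
mfxFrameSurface1* decOut = NULL;

// Queue a decode; DecodeFrameAsync returns before the GPU work completes
mfxStatus sts = MFXVideoDECODE_DecodeFrameAsync(session, &inBS,
                                                freeSurface, &decOut, &syncpD);
if (MFX_ERR_NONE == sts) {
    // Hand the decoded surface straight to the encoder without waiting on
    // the decode sync point; the SDK tracks the dependency internally, so
    // the encoder can still be processing an earlier frame at this point
    sts = MFXVideoENCODE_EncodeFrameAsync(session, NULL, decOut,
                                          &outBS, &syncpE);
    if (MFX_ERR_NONE == sts) {
        // Synchronize only on the encode output before writing it out
        MFXVideoCORE_SyncOperation(session, syncpE, 60000);
        // ... write outBS.DataLength bytes from outBS.Data + outBS.DataOffset ...
    }
}
```

Since the application never synchronizes on the decode sync point, the decoder is free to work on the next frame while the encoder consumes the previous one, which is exactly the decode/encode overlap visible in the trace.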
We achieved a good performance improvement with the above transcode pipeline, but, as with the encode workload we explored earlier, fully utilizing the GPU requires enhancing the transcode pipeline further by introducing concurrent asynchronous tasks.
This tutorial sample is found in the tutorial samples package under the name "simple_5_transcode_opaque". The code is extensively documented with inline comments detailing each step required to set up and execute the use case.