This Intel® Media SDK tutorial sample operates in exactly the same way as the previous tutorial sample, "simple_4_vpp_resize_denoise", except that it uses video memory surfaces.
For more details on using video memory, please refer to the "simple_2_decode" sample description.
As with the previous workload, the RunFrameVPPAsync() call leads to the VPP Submit operation. However, in this case the task is submitted to the GPU almost immediately, since the input surface already resides in video memory.
Like the encode workloads discussed earlier (such as “simple_3_encode_vmem”), Intel Media SDK uses a polling mechanism, via VPP Query, to determine if the GPU has fully processed the frame. The GPU is queried every 1 ms until the frame is ready. Since the current workload is synchronous, the SyncOperation() call waits until the next VPP Query, which can cause large gaps in GPU execution.
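To make the size of these gaps concrete, the following is a minimal arithmetic sketch, not Media SDK code. It assumes that queries happen at fixed 1 ms ticks measured from the moment the task is submitted, which is a simplification of the real behavior. The function names `first_query_after_us` and `idle_gap_us` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model, not part of Intel Media SDK.
// Assumes queries occur at multiples of query_interval_us after submission.
// Returns the time of the first query at or after the GPU finishes the frame.
inline std::uint64_t first_query_after_us(std::uint64_t gpu_done_us,
                                          std::uint64_t query_interval_us = 1000) {
    assert(query_interval_us > 0);
    // Round the completion time up to the next query tick.
    return (gpu_done_us + query_interval_us - 1) / query_interval_us * query_interval_us;
}

// Time during which the GPU has finished but the synchronous caller has not
// yet been told, so no new work can be submitted.
inline std::uint64_t idle_gap_us(std::uint64_t gpu_done_us,
                                 std::uint64_t query_interval_us = 1000) {
    return first_query_after_us(gpu_done_us, query_interval_us) - gpu_done_us;
}
```

Under these assumptions, a frame that the GPU completes 1200 µs after submission is only reported ready at the 2000 µs query, so `idle_gap_us(1200)` is 800: the GPU would sit idle for 40% of each 2 ms cycle in a strictly synchronous loop.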
We already demonstrated how to achieve greater performance by making the Intel Media SDK work in an asynchronous fashion, as in the “simple_3_encode_vmem_async” workload. The same approach can be used for VPP processing, so we will not explore this case further. Suffice it to say that GPU utilization and performance can be improved significantly by applying the task concurrency approach for VPP.
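The effect of task concurrency can be illustrated with a small timing model. This is a sketch under stated assumptions, not the implementation used by the sample: the GPU processes frames one at a time, completion is only noticed at fixed 1 ms ticks, and a new frame may be submitted as soon as the frame `depth` positions earlier has been reported complete. The function name `total_time_us` and all of its parameters are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical timing model, not part of Intel Media SDK.
// Returns the time (in microseconds) at which the last of `frames` frames is
// reported complete when up to `depth` tasks may be in flight at once.
inline std::uint64_t total_time_us(std::size_t frames, std::size_t depth,
                                   std::uint64_t gpu_us_per_frame,
                                   std::uint64_t query_interval_us = 1000) {
    assert(depth > 0 && query_interval_us > 0);
    std::vector<std::uint64_t> detected(frames, 0);  // when each frame is reported done
    std::uint64_t gpu_free_at = 0;                   // when the GPU finishes its current frame
    for (std::size_t i = 0; i < frames; ++i) {
        // A slot becomes available once frame (i - depth) has been synced.
        const std::uint64_t submit = (i < depth) ? 0 : detected[i - depth];
        // The GPU starts the frame when it is both submitted and the GPU is free.
        const std::uint64_t start = std::max(submit, gpu_free_at);
        gpu_free_at = start + gpu_us_per_frame;
        // Completion is noticed at the next query tick (rounded up).
        detected[i] = (gpu_free_at + query_interval_us - 1) / query_interval_us
                      * query_interval_us;
    }
    return frames == 0 ? 0 : detected.back();
}
```

In this model, ten frames of 1200 µs each take 20000 µs with `depth` set to 1 (the synchronous case), but 12000 µs with `depth` set to 4, which equals the pure GPU time because the GPU never waits for the next query. Real gains depend on the hardware, driver, and workload.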
This concludes the analysis of Intel Media SDK decode, encode and VPP workloads. The next tutorial sections explore the behavior of workloads combining several Intel Media SDK components.
This tutorial sample is found in the tutorial samples package under the name "simple_4_vpp_resize_denoise_vmem". The code is extensively documented with inline comments detailing each step required to set up and execute the use case.