This Intel® Media SDK tutorial sample illustrates the simplest way to implement hardware (HW) decode using system memory surfaces.
The basic goal of this example is to illustrate why asynchronous operation using video memory surfaces is necessary. While it is simpler to use system memory synchronously, as in this example, this introduces unnecessary bottlenecks:
- Surfaces must be copied from the GPU to the CPU. Although this copy is unavoidable for any decode that writes frames to disk, buffering is less efficient in this scenario.
- For a single decode (or possibly even several), gaps in the processing pipeline cannot easily be filled. Because the GPU is not in constant use, it may fall out of turbo mode.
Based on the above analysis, we should be able to improve the performance of the workload by using video memory surfaces instead of system memory surfaces. The next tutorial sample will explore such a scenario.
This tutorial sample is found in the tutorial samples package under the name "simple_2_decode". The code is extensively documented with inline comments detailing each step required to set up and execute the use case.