I am trying to understand how I should manage the output surface pool when running a decoding session. I have a Sandy Bridge processor that reports HW API 1.1 / SW API 1.4. In my application I would like to decode frames into system memory and post-process the decoded frames. I do not intend to provide an external frame allocator, so I assume the SDK library will allocate and manage its own DPB buffers regardless of which implementation (HW or SW) is used. I also assume the HW implementation has to copy from video memory to system memory in this case; I do not care about that performance penalty.
Given the above, should I call MFXVideoDECODE_QueryIOSurf() to obtain the minimum number of surfaces and allocate my surfaces accordingly? And since the SDK has its own DPB buffers, will it still keep one of my externally provided surfaces locked for a longer period if that surface happens to hold an IDR frame?
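For reference, this is roughly how I size the system-memory pool, as a minimal sketch rather than real SDK code. The `PoolRequest` struct is a flattened stand-in I made up for the fields I read back from MFXVideoDECODE_QueryIOSurf() (the real values sit in mfxFrameAllocRequest and its Info member), the helper names are my own, and the alignment passed in (32 in my case, copied from the sample code) is an assumption, not something I have seen stated as a requirement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, flattened stand-in for the values reported by
   MFXVideoDECODE_QueryIOSurf(); not the real SDK structure. */
typedef struct {
    uint16_t NumFrameMin;       /* minimum surface count reported */
    uint16_t NumFrameSuggested; /* suggested surface count reported */
    uint16_t Width;             /* frame width in pixels */
    uint16_t Height;            /* frame height in pixels */
} PoolRequest;

/* Round v up to the next multiple of a (a must be non-zero). */
static size_t align_up(size_t v, size_t a)
{
    assert(a != 0);
    return (v + a - 1) / a * a;
}

/* NV12 stores a full-resolution Y plane plus a half-height interleaved
   UV plane, i.e. 12 bits per pixel. */
static size_t nv12_surface_bytes(const PoolRequest *req, size_t alignment)
{
    size_t w = align_up(req->Width, alignment);
    size_t h = align_up(req->Height, alignment);
    return w * h * 3 / 2;
}

/* Total bytes for a pool sized to the suggested surface count. */
static size_t pool_bytes(const PoolRequest *req, size_t alignment)
{
    return (size_t)req->NumFrameSuggested * nv12_surface_bytes(req, alignment);
}
```

For a 1920x1080 stream with 32-pixel alignment this gives 1920 x 1088 x 1.5 bytes per surface. Whether NumFrameSuggested is the right count when the SDK keeps its own DPB is exactly what I am unsure about.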
I am running the sample_decode application and I do not fully understand the behavior below:
1. HW API: I can see that the external surface is only locked between the MFXVideoDECODE_DecodeFrameAsync() call and the MFXVideoCORE_SyncOperation() call. My take is that the lock only covers filling the surface with the output frame. During the whole session, the same single external surface is supplied every time. This seems to confirm my understanding above.
2. SW API: I can see that the external surfaces are locked in turn, and more than one surface is locked at the same time. This seems to contradict my understanding above.
So is there a difference between the HW and SW implementations here? Is the external surface pool used as the DPB buffer pool when the SW implementation is called?
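To make the question concrete, this is roughly how I pick the surface to pass to the next decode call, as a minimal sketch. The `Surface` struct is a stand-in I wrote for this post that models only the `Data.Locked` counter of mfxFrameSurface1, and `find_free_surface` is my own helper, not an SDK function.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for mfxFrameSurface1: only the Data.Locked counter is modeled.
   The real structure comes from the SDK headers. */
typedef struct {
    struct {
        unsigned short Locked; /* non-zero while the decoder holds the surface */
    } Data;
} Surface;

/* Return the index of the first surface the decoder is not holding,
   or -1 if every surface in the pool is currently locked. */
static int find_free_surface(const Surface *pool, size_t count)
{
    assert(pool != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        if (pool[i].Data.Locked == 0)
            return (int)i;
    }
    return -1;
}
```

With the HW implementation this always returns index 0 for me, while with the SW implementation the returned index moves through the pool because several surfaces stay locked at once.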
Another question is about AsyncDepth. According to an article in this forum on video conferencing, it is recommended to set AsyncDepth to 1 and feed complete frames. This is understandable for conference video, since B frames are not used.
But what about main-profile streams, where frame reordering definitely happens? Can I still set AsyncDepth to 1 when I want to pull decoded frames as soon as possible? If I can, what happens if I feed incomplete frames? Will the output be corrupted? I have actually set AsyncDepth to 1 and fed incomplete frames, and the output seems fine with the HW implementation.
I have also encountered other API failures in my multithreaded application, such as the sync operation returning -1 (HW implementation) and a failure-to-lock-memory error (SW implementation). I hope your clarification will show whether those issues are caused by my incorrect management of the surface pool.