Media SDK decoder consumes two frames in first decode call

Hi, I have started implementing use of the Quick Sync video decoder through the Media SDK and stumbled upon something that for me is a problem.

 

The DecodeFrameAsync call consumes two frames the first time it is called after instantiation, when I’m feeding it more than one frame of encoded data at once. In this particular case I’m feeding it two complete GOPs of H.264 data, and in all the subsequent calls it only consumes one frame. I have only tested this with the software implementation.

 

Our current decoder solution never consumes more than one frame in a decoder call, even when fed a large chunk of data with many frames in it (and then, like the Media SDK, it signals how much of the encoded data was consumed). Various applications use this decoder with code based on this assumption/interface, and I would thus like to keep it as it is.

 

Is there a way for me to configure the Media SDK decoder to never consume more than one frame in one call to DecodeFrameAsync?

Tony Pabon (Intel)'s picture

Hi,

The application can set mfxVideoParam::AsyncDepth=1 to disable any decoder buffering of output frames (that buffering exists to improve transcoding throughput). With AsyncDepth=1, the application must synchronize after the decoding or transcoding operation of each frame.

-Tony

Hi Tony, thank you for the reply. I do however already set AsyncDepth = 1, but regardless of that the decoder still consumes two complete frames from the input buffer given in the very first call to DecodeFrameAsync.

If there is no way of configuring the maximum number of complete frames the decoder is allowed to consume in one call to DecodeFrameAsync, is there alternatively a way of detecting the number of frames that were consumed? (Without having to empty the decoder with null input and waiting until no more frames come out.)

 

Please, I would very much like to know where I stand on this issue.

Tony Pabon (Intel)'s picture

Hi,

Sorry I missed this response.

I am not seeing the behavior you describe.  Assuming the first frame is an I frame and AsyncDepth is set to 1, and SyncOperation is then called, the second frame has not been decoded and is not decoded until a second call to “DecodeFrameAsync” is made.

Example of this would be seen using the Decoder sample application and test_stream.264 from http://software.intel.com/en-us/media-solutions-portal and using command line of

sample_decode.exe h264 -i test_stream.264 -r

I might not be understanding what you mean by “consumes two complete frames”.  A pointer to the bitstream is supplied to DecodeFrameAsync, and the decoder examines enough of the bitstream to produce the requested frame (or reports ‘MFX_ERR_MORE_DATA’ to indicate that more data is needed).

-Tony

Jumping in at the end so if this doesn't fit, oh well.

From my experience, if I send a frame and only a frame, the decoder wants more data.  This implies the encoder is peeking ahead seeing what comes next.  If there is nothing else there it wants more data.  I forget what I did, maybe include an SEI frame after.  It sounds awfully hackish but this is blackbox no-source-available stuff so you do what you have to do.  I won't even confirm that is what I did, or needed to do, or maybe it was some other problem, solved some other way.  I know I did have a problem getting a frame out when expected.

 

Replace the word "encoder" with decoder above.

Tony Pabon (Intel)'s picture

Can you see the unmodified sample_decode sample work with "test_stream.264"?  DecodeFrameAsync and SyncOperation are each called once, and the first frame is rendered and displayed before the 2nd frame is decoded.

Or are you saying you have an issue when "only" a small part of the bitstream is supplied?  The original question here mentioned "feeding it two complete GOPs".  If the decoder asks for more data, there is a reason more data is required.

-Tony

If the Q above was to betlet, then

Since I have this in front of me now, I looked.  Can't find a peep about what I wrote above.  It may be there, but I couldn't find it.  I found this where I expected to find that

               // decode a frame asynchronously (returns immediately)
               //  - if input bitstream contains multiple frames DecodeFrameAsync starts decoding multiple frames, and removes them from bitstream
               sts = m_mfxDecPtr->DecodeFrameAsync(&mfxDecBS, m_surfacesDecVppPtrPtr[nIndex], &pmfxOutSurface, &syncDec);
               // Ignore warnings if output is available
               // if no output and no action required just repeat the DecodeFrameAsync call

Which is sort of the opposite of what I wrote first, in that if you send it more than a single frame it starts decoding them.  Strike this if it only adds to the confusion.  I am not sure "...if input bitstream contains multiple frames..." is my comment, or a quote from a sample.

I have it working fine here, so this is not a problem for me.  I think I run one frame behind for my h264 input. I queue the frame's metadata and assemble it again on the way out (transcode op); I sync metadata to output by the frame's timestamp, so even if it needs even more frames, I'd still be okay.

OK, I did find it.

               // 30-Oct-2013: in case I did not append scprefix/sc (SEI will do) then add it here
               // (possibly duplicating but maxnix since stops after first non-sequence bits)
               mfxDecBS.Data[buffBytes + 0] = 0;
               mfxDecBS.Data[buffBytes + 1] = 0;
               mfxDecBS.Data[buffBytes + 2] = 0;
               mfxDecBS.Data[buffBytes + 3] = 1;
               mfxDecBS.Data[buffBytes + 4] = 6;  // need to 'end it' with something known to not be SPS/PPS start code else MFX_ERR_MORE_DATA (-10)
               mfxDecBS.DataLength = buffBytes + 5;

But not much in the way of explanation.  From what I wrote, I'd say the decoder keeps looking beyond the current frame until it knows the next is not SPS/PPS.

But for the OP's problem, if he's providing an entire GOP of frames, then my first comment above would better fit, assuming that comment is true.
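The snippet above writes five bytes directly past the frame data, which only works if the buffer has room for them. A minimal sketch of the same workaround on a std::vector is shown below; `appendDummyNalHeader` is a hypothetical helper name, not a Media SDK function, and whether the decoder actually needs this trailing start code is only the poster's observation, not documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: keeps the first `frameBytes` bytes of `buffer` and
// appends a 4-byte start code followed by a NAL header byte of type 6 (SEI),
// mirroring the five bytes written in the snippet above.
// Returns the new total length, which would become mfxBitstream::DataLength.
std::size_t appendDummyNalHeader(std::vector<std::uint8_t>& buffer,
                                 std::size_t frameBytes)
{
    assert(frameBytes <= buffer.size());
    static const std::uint8_t kTail[] = {0x00, 0x00, 0x00, 0x01, 0x06};
    buffer.resize(frameBytes);  // drop anything after the frame
    buffer.insert(buffer.end(), kTail, kTail + sizeof kTail);
    return buffer.size();
}
```

Using a vector makes the capacity requirement explicit; with a raw mfxBitstream buffer, MaxLength would have to be at least five bytes larger than the frame.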

Let me elaborate on ”consumes two frames”.

I have a pointer with two GOPs worth of data with a total length of 241481 bytes. The first frame in the data (an IDR-frame) has a length of 42578 bytes. The second frame, a P-frame with the length of 4051 bytes, comes immediately after the first frame. And so forth.

When I call DecodeFrameAsync with this data, the “Offset” value in the mfxBitstream will be set to 46629 ( = 42578 + 4051), when that one call has executed.

All subsequent calls to DecodeFrameAsync only increment the Offset value by one frame's worth of data.

I know of course that this is not a bug, but perhaps simply for the convenience of the user. Unfortunately, it is a problem for me, since (a lot of) other code relies on the decoder never taking more than one frame from the given input in one call to decode frame.

Is there a way to configure it to never take more than one frame from the given input? Or alternatively tell how many frames worth of data it has taken in a call?

Thank you for your support.
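One way to answer the second question on the application side, without draining the decoder, is to scan the bytes between the old and new offset (the field is named DataOffset in mfxBitstream) and count how many pictures start there. The sketch below is not part of the Media SDK; `countPicturesStarted` is a hypothetical helper. It assumes an Annex B byte stream (NAL units separated by 00 00 01 start codes), progressive frames, and no arbitrary slice order or redundant pictures, so that a slice NAL unit (type 1 or 5) with first_mb_in_slice equal to 0 marks the start of a picture. Because 0 is coded as the single bit 1 in Exp-Golomb, that condition is simply "top bit of the first payload byte is set".

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: counts pictures whose first slice NAL unit begins
// in [from, to) of an Annex B H.264 buffer of `size` bytes.
// Assumes progressive frames, no ASO/FMO and no redundant pictures.
std::size_t countPicturesStarted(const std::uint8_t* data, std::size_t size,
                                 std::size_t from, std::size_t to)
{
    assert(data != nullptr && from <= to);
    if (to > size) to = size;
    std::size_t count = 0;
    // i + 4 < size guarantees the NAL header and first payload byte exist.
    for (std::size_t i = from; i < to && i + 4 < size; ++i) {
        if (data[i] != 0 || data[i + 1] != 0 || data[i + 2] != 1) continue;
        const std::uint8_t nalType = data[i + 3] & 0x1F;
        const bool isSlice = (nalType == 1 || nalType == 5);
        // first_mb_in_slice == 0 is coded as a leading '1' bit.
        const bool firstMbIsZero = (data[i + 4] & 0x80) != 0;
        if (isSlice && firstMbIsZero) ++count;
        i += 2;  // skip the rest of the start code
    }
    return count;
}
```

For the numbers above, calling it with from = 0 and to = 46629 on the 241481-byte buffer would be expected to report two pictures, provided the stream meets the stated assumptions.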

Tony Pabon (Intel)'s picture

Ah! Thank you. I see what you are asking now. (You want the 'offset' of the input bitstream to step one frame at a time.)

I believe the answer is 'no', there is no way to limit how much of the supplied bitstream is actually used in a single call.  I think the hardware decoder does this to check timestamps or framerate.  Using low latency configurations (see the video conferencing white paper) might change HW behavior, but I need to check with experts.

-Tony
Tony Pabon (Intel)'s picture

Answer from expert was "the only way is to give not more than one frame".

 

-Tony
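Giving the decoder no more than one frame means the application has to find the frame boundary itself. A minimal sketch of that step follows, under the same assumptions as an Annex B H.264 stream with progressive frames and no arbitrary slice order; `endOfFirstAccessUnit` is a hypothetical helper, not a Media SDK call. It returns the offset where the next access unit begins, so the caller could limit mfxBitstream::DataLength to that point before each DecodeFrameAsync call. As noted earlier in the thread, the decoder may answer MFX_ERR_MORE_DATA when nothing follows the frame; the MFX_BITSTREAM_COMPLETE_FRAME value for mfxBitstream::DataFlag looks relevant to that case and is worth checking in the SDK manual.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: returns the offset at which the access unit after
// the first picture found at or after `from` begins, or `size` if the
// buffer ends first. Assumes progressive frames, no ASO/FMO.
std::size_t endOfFirstAccessUnit(const std::uint8_t* data, std::size_t size,
                                 std::size_t from)
{
    assert(data != nullptr && from <= size);
    bool sawSlice = false;
    for (std::size_t i = from; i + 4 < size; ++i) {
        if (data[i] != 0 || data[i + 1] != 0 || data[i + 2] != 1) continue;
        const std::uint8_t nalType = data[i + 3] & 0x1F;
        const bool isSlice = (nalType == 1 || nalType == 5);
        // A slice with first_mb_in_slice == 0 (leading '1' bit) starts a picture.
        const bool startsPicture = isSlice && (data[i + 4] & 0x80) != 0;
        // SEI, SPS, PPS, AUD and types 14..18 belong to the next access unit
        // once a slice of the current one has been seen.
        const bool leadsNextUnit = (nalType >= 6 && nalType <= 9) ||
                                   (nalType >= 14 && nalType <= 18);
        if (sawSlice && (startsPicture || leadsNextUnit)) {
            // Keep the extra zero of a 4-byte start code with the next unit.
            return (i > from && data[i - 1] == 0) ? i - 1 : i;
        }
        if (isSlice) sawSlice = true;
        i += 2;  // skip the rest of the start code
    }
    return size;
}
```

Calling it repeatedly, each time passing the previous return value as `from`, walks the buffer one frame at a time, which matches the one-frame-per-call contract the existing applications rely on.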
