Timestamp issue with the h264 encoder & decoder

I wrote a GStreamer plugin to decode H.264 data with the Intel Media SDK. mfxBitstream.TimeStamp is passed for each frame, but the output timestamps from mfxFrameSurface1.Data.TimeStamp are not in increasing order. MFXVideoDECODE_DecodeFrameAsync is used to decode the H.264 frames.

Example input:
1. 0.0
2. 0.2
3. 0.4
4. 0.8

Output timestamps (actual):
1. 0.0
2. 0.8
3. 0.2
4. 0.4

Expected:
1. 0.0
2. 0.2
3. 0.4
4. 0.8

This happens only for some streams that were encoded with the H.264 encoder of the Intel Media SDK.


Hi,

Media SDK reads frames from the stream in stream (decode) order, not display order. So one way to get the results you report is to set an increasing TimeStamp in the bitstream before each call to DecodeFrameAsync and then look at the timestamp order on the surfaces after SyncOperation.

If you open your stream (generated with Media SDK encode) in a tool such as Elecard StreamEye you can see that the frames are not located in display order.

To verify and use timestamps in a complete way you need to involve a demuxer (and a muxer for encode), which carries each frame with its corresponding timestamp in a container. That way you can make sure each frame that is fed to DecodeFrameAsync has the correct timestamp (which will also align with display order).
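The effect described above can be reproduced without the SDK. The following is a minimal sketch, not Media SDK code: `Frame`, `display_index`, `stamp_in_stream_order` and `decode` are hypothetical names, and the decoder is modelled simply as "emit frames in display order, leave the timestamp untouched". The I P B B stream layout is an assumption chosen for illustration. Timestamps are kept in 90 kHz units, which is what the SDK's TimeStamp fields use.

```python
from dataclasses import dataclass

TICKS_PER_SECOND = 90_000  # Media SDK timestamps are expressed in 90 kHz units


@dataclass
class Frame:
    kind: str           # "I", "P" or "B"
    display_index: int  # position in display order (hypothetical field)
    timestamp: int = 0  # 90 kHz ticks, carried from bitstream to surface


def stamp_in_stream_order(frames, seconds):
    """Attach one timestamp per frame in the order frames are fed to the decoder."""
    for frame, sec in zip(frames, seconds):
        frame.timestamp = round(sec * TICKS_PER_SECOND)
    return frames


def decode(frames):
    """Stand-in for the decoder: emit frames in display order, timestamps untouched."""
    return sorted(frames, key=lambda f: f.display_index)


# Assumed stream (decode) order I P B B, which is displayed as I B B P.
stream = [Frame("I", 0), Frame("P", 3), Frame("B", 1), Frame("B", 2)]
stamp_in_stream_order(stream, [0.0, 0.2, 0.4, 0.8])

output_seconds = [f.timestamp / TICKS_PER_SECOND for f in decode(stream)]
print(output_seconds)  # no longer increasing, because stamps followed stream order
```

With a demuxer supplying the real presentation timestamp for each frame, the same `decode` step would return increasing values.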

Hope this helps.

Regards,
Petter


Time stamps in video streams can come in two forms: PTS (presentation time stamp) and DTS (decoding time stamp).

When PTS is used (most common), the input frames have time stamps that describe the output frames. The order of decoded frames is different. Example:

Input frames: I (t=1) P (t=3) B (t=2)
Output frames: I (t=1) B (t=2) P (t=3)

So the Media SDK ties the time stamps to the frame data. This is the correct behavior for PTS time stamps.

DTS time stamps are less common and mostly exist in VC1 files. In this case the Media SDK can't know that it's dealing with DTS time stamps. It will still output time stamps in the same manner as PTS.

A couple of solutions to your problem:

1) Have the splitter/demuxer reshuffle the time stamps. If the time stamps at input are monotonic, they are DTS. PTS time stamps are never monotonic if you look at 4-8 consecutive frames.

2) Do the analysis yourself. This seems easy, but some input time stamps are not valid or they may be corrupted, so you'll need a more robust way of doing this. You can look at my decoder implementation (Intel QuickSync Decoder), which does that: http://sourceforge.net/p/qsdecoder/code
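As a rough illustration of the I/P/B example and of the "monotonic means DTS" check, here is a small sketch. It is not taken from the QuickSync Decoder sources; `reorder_by_pts`, `is_monotonic` and the `depth` parameter are hypothetical names, and the bounded min-heap is just one common way to model a reorder buffer. Note that the monotonic check is only a heuristic: a stream without B frames has monotonic PTS as well.

```python
import heapq


def reorder_by_pts(decode_order, depth):
    """Yield (pts, kind) in presentation order using a bounded min-heap.

    `depth` is the assumed maximum number of frames a frame can be held back.
    """
    heap = []
    for pts, kind in decode_order:
        heapq.heappush(heap, (pts, kind))
        if len(heap) > depth:
            yield heapq.heappop(heap)
    while heap:
        yield heapq.heappop(heap)


def is_monotonic(timestamps):
    """True if the timestamps never step backwards."""
    return all(a <= b for a, b in zip(timestamps, timestamps[1:]))


# The example from the post: input I (t=1), P (t=3), B (t=2).
decode_order = [(1, "I"), (3, "P"), (2, "B")]
display_order = list(reorder_by_pts(decode_order, depth=1))

print(display_order)
print(is_monotonic([pts for pts, _ in decode_order]))   # input looks like PTS
print(is_monotonic([pts for pts, _ in display_order]))  # output is in display order
```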

Eric Gur, Intel QuickSync Decoder Author Processor Client Application Engineer Intel Corp.

It looks like the behavior of the encoder has changed in the 2013 SDK release.  The 2013 encoder seems to make QuickTime happy when dealing with B frames by reordering the output stream into display order. 

2012 encoder ordering

pkt_pts_time=0.000000  pict_type=I
pkt_pts_time=0.066625  pict_type=B
pkt_pts_time=0.099937  pict_type=B
pkt_pts_time=0.033313  pict_type=P
pkt_pts_time=0.166563  pict_type=B
pkt_pts_time=0.199875  pict_type=B
pkt_pts_time=0.133250  pict_type=P

2013 encoder ordering

pkt_pts_time=0.033313  pict_type=I
pkt_pts_time=0.066625  pict_type=B
pkt_pts_time=0.099958  pict_type=P
pkt_pts_time=0.133292  pict_type=B
pkt_pts_time=0.166625  pict_type=P
pkt_pts_time=0.199958  pict_type=B
pkt_pts_time=0.233292  pict_type=P

Hi,

There should be no change with regard to timestamp handling/order between Media SDK 2012 and 2013. From your log it seems that you have configured the encoder differently for the two cases, hence the different timestamps, e.g. I B B P vs. I B P.

QuickTime is known to be very sensitive when it comes to DTS, so application logic will have to make sure DTS values are set appropriately. That said, please explore the new Media SDK 2013 (API 1.6) DecodeTimeStamp parameter in mfxBitstream. If you use this parameter, DTS will be computed automatically by the SDK based on the input timestamps.
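For application logic that has to derive DTS itself, one simple approach can be sketched as follows. This is an assumption for illustration only, not the algorithm the SDK uses for DecodeTimeStamp: take the PTS values in stream order, sort them to get a monotonic sequence, and shift that sequence back just far enough that no frame has a DTS later than its PTS. `derive_dts` is a hypothetical helper, and the sample values assume 30 fps (3000 ticks per frame at 90 kHz).

```python
def derive_dts(pts_in_stream_order):
    """Return one DTS per frame (stream order), monotonic and never above the PTS."""
    if not pts_in_stream_order:
        return []
    ordered = sorted(pts_in_stream_order)
    # Smallest shift that guarantees DTS <= PTS for every frame.
    delay = max(0, max(s - p for s, p in zip(ordered, pts_in_stream_order)))
    return [s - delay for s in ordered]


# Assumed stream order I P B B at 30 fps, PTS in 90 kHz ticks.
pts = [0, 9000, 3000, 6000]
dts = derive_dts(pts)
for p, d in zip(pts, dts):
    print(f"pts={p:6d} dts={d:6d}")
```

The first DTS comes out negative here, so a muxer that cannot store negative values would need the whole timeline offset by the same amount.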

Regards,
Petter

I'm fairly sure that in both cases (2012 and 2013 SDKs) I'm using PRESET_BALANCED, so presumably the IBP selection logic has changed internally (although it's possible I used a different system to collect the samples and this somehow affected IBP selection).

But the number of B frames isn't what I'm trying to demonstrate. Instead, note the PTS timestamps. In 2013 the PTS timestamps are always increasing instead of jumping backwards as in 2012 (i.e. frames are written to file in display order), which seems to correct the jittery playback problem in QuickTime.
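The difference between the two logs can be checked mechanically. The sketch below parses the `pkt_pts_time` / `pict_type` pairs posted earlier in this thread and lists every place where the PTS steps backwards; `parse_frames` and `backward_jumps` are hypothetical helper names.

```python
def parse_frames(text):
    """Pair each pkt_pts_time value with the pict_type that follows it."""
    frames, pts = [], None
    for token in text.split():
        key, _, value = token.partition("=")
        if key == "pkt_pts_time":
            pts = float(value)
        elif key == "pict_type" and pts is not None:
            frames.append((pts, value))
            pts = None
    return frames


def backward_jumps(frames):
    """Return (previous_pts, pts) for every frame whose PTS is lower than the one before."""
    return [(a, b) for (a, _), (b, _) in zip(frames, frames[1:]) if b < a]


LOG_2012 = """
pkt_pts_time=0.000000 pict_type=I
pkt_pts_time=0.066625 pict_type=B
pkt_pts_time=0.099937 pict_type=B
pkt_pts_time=0.033313 pict_type=P
pkt_pts_time=0.166563 pict_type=B
pkt_pts_time=0.199875 pict_type=B
pkt_pts_time=0.133250 pict_type=P
"""

LOG_2013 = """
pkt_pts_time=0.033313 pict_type=I
pkt_pts_time=0.066625 pict_type=B
pkt_pts_time=0.099958 pict_type=P
pkt_pts_time=0.133292 pict_type=B
pkt_pts_time=0.166625 pict_type=P
pkt_pts_time=0.199958 pict_type=B
pkt_pts_time=0.233292 pict_type=P
"""

print(backward_jumps(parse_frames(LOG_2012)))
print(backward_jumps(parse_frames(LOG_2013)))
```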

If I'm not doing anything in code to configure or use DecodeTimeStamp in API 1.6, does the default implementation affect frame reordering when using the H.264 encoder and H.264 mux?

Hi,

There should be no such radical change in encoder behavior between the 2012 and 2013 SDKs. The only reasons I can think of are that you may be using SW encode in one case and HW encode in the other, or a different set of encoder parameters, but I'm not sure that explains the ordering you are seeing.

As stated earlier, QuickTime is not as robust as some other players when it comes to timestamps. For timestamp handling that will likely work for the majority of content, check out the Media SDK-FFmpeg integration section of the new Media SDK tutorial. The sample showcases how PTS is handled and how DTS is computed to generate containers that QuickTime can handle. Note that the new "DecodeTimeStamp" parameter provides an alternate approach (not showcased in the sample).
http://software.intel.com/en-us/articles/intel-media-sdk-tutorial  

Also refer to this recent post, http://software.intel.com/en-us/forums/topic/369095, it may help in understanding how timestamps are handled for decode.

Regards,
Petter 


Hi,

Quote:

Eric Gur (Intel) wrote:

If the time stamps at input are monotonic, they are DTS.

This is totally right, but there is one exception: video streams that do not contain any B frames (for example, streams encoded with MPEG-2 Simple Profile or H.264 Baseline Profile). In those streams decode order and display order are the same, so the PTS values are monotonic as well.

Quote:

Eric Gur (Intel) wrote:

2) Do the analysis yourself. This seems easy, but some input timestamps are not valid or they may be corrupted. So you'll need a more robust way of doing this. You can look at my decoder implementation (Intel QuickSync Decoder) which does that: http://sourceforge.net/p/qsdecoder/code

Before clearly understanding timestamp management in MSDK, I tried this approach myself. I agree with Eric: this analysis really has to be robust. I have to deal with broadcast streams that can be cut anywhere, and I had a hard time guessing how many frames the decoder discards before it outputs its first decoded frame. At this point, I guess the right way to do that is to implement a light elementary stream parser.
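A starting point for such a light parser might look like the sketch below. It only covers the H.264 Annex B byte stream format: it splits the data at `00 00 01` start codes and reads `nal_unit_type` from the low five bits of the first NAL byte. Slice headers are not parsed, so it cannot tell I, P and B slices apart; `iter_nal_units`, `nal_types` and the sample bytes are hypothetical.

```python
START_CODE = b"\x00\x00\x01"

NAL_NAMES = {1: "non-IDR slice", 5: "IDR slice", 6: "SEI", 7: "SPS", 8: "PPS", 9: "AUD"}


def iter_nal_units(data: bytes):
    """Yield each NAL unit (without its start code) from an Annex B byte stream."""
    starts = []
    pos = data.find(START_CODE)
    while pos != -1:
        starts.append(pos + len(START_CODE))
        pos = data.find(START_CODE, pos + len(START_CODE))
    for n, begin in enumerate(starts):
        end = starts[n + 1] - len(START_CODE) if n + 1 < len(starts) else len(data)
        # A NAL unit never ends in 0x00, so trailing zeros belong to the
        # next (4-byte) start code or to padding and can be dropped.
        nal = data[begin:end].rstrip(b"\x00")
        if nal:
            yield nal


def nal_types(data: bytes):
    """Return the nal_unit_type of every NAL unit, in stream order."""
    return [nal[0] & 0x1F for nal in iter_nal_units(data)]


# Hypothetical, truncated sample: SPS, PPS, IDR slice, non-IDR slice.
sample = (b"\x00\x00\x00\x01\x67\x42" b"\x00\x00\x01\x68\xce"
          b"\x00\x00\x00\x01\x65\x88" b"\x00\x00\x01\x41\x9a")

for t in nal_types(sample):
    print(t, NAL_NAMES.get(t, "other"))
```

For a stream cut at an arbitrary point, the position of the first type 5 (IDR) unit gives a first estimate of where clean decoding can begin.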

Regards,
