decoder bugs

Using MSDK 3.0 beta 2 I've run into the following two bugs:

(1) The "DecodeHeader" API returns MFX_ERR_MORE_DATA when being called with the following h264 header sequence:

00 00 00 01 67 64 00 33 AC 34 E6 02 C0 49 FB 84 00 00 0F A0 00 03 0D 42 3C 60 C6 68
00 00 00 01 68 EE BC B0

However, if I just add an AUD (00 00 00 01 09 10) to the end of that data, "DecodeHeader" is happy to do its job. Your own "MP4/AVC Decode using the Intel Media SDK" article/demo stumbles over the same problem. In that demo DecodeHeader always fails.
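For reference, here's roughly how I'm feeding the data (a minimal sketch, not my actual filter code; session creation is elided and TryDecodeHeader is just an illustrative wrapper):

#include <mfxvideo.h>
#include <cstring>

// Feed exactly the SPS/PPS bytes above to DecodeHeader.
mfxStatus TryDecodeHeader(mfxSession session, mfxU8* data, mfxU32 size)
{
    mfxBitstream bs;
    memset(&bs, 0, sizeof(bs));
    bs.Data       = data;
    bs.DataLength = size;   // SPS + PPS only, no trailing start code
    bs.MaxLength  = size;

    mfxVideoParam par;
    memset(&par, 0, sizeof(par));
    par.mfx.CodecId = MFX_CODEC_AVC;

    // Returns MFX_ERR_MORE_DATA for the bytes above; appending an AUD
    // (00 00 00 01 09 10) makes it return MFX_ERR_NONE.
    return MFXVideoDECODE_DecodeHeader(session, &bs, &par);
}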

(2) After seeking in a VC-1 video stream, the very first frame output by your (software) decoder often contains block artifacts. The blocks are usually in areas where there's strong motion. The very next frame is clean. It's only the first frame which shows these artifacts. And it doesn't always occur, but very often. It occurs with and without "injected headers". If you need a sample to reproduce this problem, just let me know. I've seen the same problem with 2 very different files, though (one SD interlaced, the other Blu-Ray 24p), so I'm pretty sure it will happen with all files.

Here's a screenshot of the problem:

Hi,

We have received your questions and are looking into them. For your second question on VC-1 video seek, could you provide the sample you mentioned? It might also help to know a little more about how the file was created (if there were any additional steps on your part) and which players display the problem. Also, do you see the same artifacts if you decode straight to a YUV?

Thanks,

Jeff

Hi Jeff,

thanks for the quick reply. The Intel Media SDK support is second to none!! :)

To answer your questions:

(1) The file is an interlaced SD m2ts, directly taken from a Blu-Ray. Of course I've verified that the problem only occurs with the Intel decoder. It does not occur with the Microsoft VC-1 decoder, so the file doesn't seem to be damaged.

(2) Here's a sample. I've used a hex editor to cut out a section of the file (of course I've left the 192 byte m2ts container structure intact). Please let me know when you've downloaded the sample, so I can remove it from my server again. Thanks:

http://madshi.net/intelvc1.m2ts

(3) The problem occurs with MPC-HC, PotPlayer and also with GraphEdit. So it's not player related.

(4) Most DirectShow splitters don't handle interlaced VC-1 streams in a way that the Intel decoder likes. As a result, if headers are injected after a seek, heavy image corruption can occur. I've worked around this issue by remuxing the m2ts file into an MKV file ("eac3to intelvc1.m2ts 1: intelvc1.mkv") and then splitting the MKV file with the Haali MKV Splitter. This way the Intel decoder is happy with the splitter output and injecting the headers doesn't have any negative side effects anymore.

(5) I've been testing with my own VC-1 DirectShow decoder using the Media SDK. But the same problems also occur with the official Intel VC-1 DirectShow filter which is installed by the Media SDK. In GraphEdit: "Haali Media Splitter (intelvc1.mkv) -> Official Media SDK Intel VC-1 Decoder -> Video Renderer".

Of course you can also use a different splitter, but I've found that using anything other than what's described above increases the problems instead of reducing them. The Intel VC-1 decoder seems to be very picky about which bitstream elements are grouped together for one "Decode" call. Which is ok, I guess, it just makes testing more complicated.

In order to reproduce the problem, I'd suggest remuxing the m2ts file to MKV, then using GraphEdit, with "Haali Media Splitter (mkv) -> Intel VC-1 DirectShow decoder -> Video Renderer". Start playback, then pause playback, then in paused state seek around. Sometimes you'll get a blocky picture, sometimes not.

Edit:

> do you see the same artifacts if you decode straight to a YUV?

I've tested with 2 different video renderers, with NV12 output from the Intel decoder. The problem occurred with both renderers.

I've got the file. Thanks to your detailed response we will have plenty to work with to reproduce the problem here.

Hmmmm... Just tested: The problem still occurs even if I completely destroy and recreate the decoder. My current best guess is that the decoder maybe doesn't like getting a non-key frame as the first frame to decode. Although I would guess that after a seek the splitter would send a key frame first. So I'm not really sure what's going on. Well, I'll leave this to you now. If there's anything I can help with, just let me know.

My apologies for the delay.

In response to (1) The "DecodeHeader" API returns MFX_ERR_MORE_DATA:

In this case DecodeHeader appears to be working as intended. MFX_ERR_MORE_DATA indicates that DecodeHeader needs more data to proceed. The parser does not have enough information to know that all of the bytes of the NAL unit at the end of the bitstream are present because there is no start code for the next NAL unit yet. (As you know, an AUD is one of many sequences that can indicate the end of one NAL unit and the beginning of another; it is not required.) After reading more data the next call to DecodeHeader should have enough information to proceed.

Though classified as an error, MFX_ERR_MORE_DATA might be better described as a status message indicating that something needs to be done in the program (i.e. read more data) before continuing.
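In code, the intended pattern is a simple read-and-retry loop, roughly like this (a sketch; ReadMoreData is a hypothetical helper that appends bytes from your source to the bitstream buffer):

mfxStatus sts = MFX_ERR_MORE_DATA;
while (sts == MFX_ERR_MORE_DATA)
{
    if (!ReadMoreData(&bs))   // hypothetical helper; returns false at end of stream
        break;
    sts = MFXVideoDECODE_DecodeHeader(session, &bs, &par);
}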

We will continue expanding and improving the SDK documentation to make it as clear as possible. Your feedback is appreciated and we will keep it in mind for future releases.

For this answer I assumed that your question was about general decode behavior. If there are concerns with not parsing a specific file correctly we will be glad to look into this further.

Regarding (2) After seeking in a VC-1 video stream, the very first frame output by your (software) decoder often contains block artifacts:

We've reproduced the block artifacts and determined that there are some things for us to fix to be compatible with the Haali Media Splitter and other possible VC-1 sources. Updates to the VC-1 decoder filter may be available in a future release, though a fix may not be prioritized immediately. In the meantime, the WM ASF reader was tested more thoroughly as a source for the VC-1 decoder filter sample and may provide better results. As a reminder, these filters are intended as samples and not as production-ready components.

Here is a workaround if you would like to try updating the VC-1 decoder filter yourself: Comment out the copy of m_pVC1SeqHeader into the data buffer passed to m_pDecoder->RunDecode in CVC1DecVideoFilter::Receive (vc1_dec_filter.cpp). In my tests this ended up breaking compatibility with the WM ASF reader but enabling the Haali Media Splitter pipeline (intelvc1.mkv -> Haali Media Splitter -> Intel Media SDK VC-1 Decoder -> EVR) to work with playback from the beginning as well as arbitrary seeks.
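Paraphrased (the sample's exact surrounding code may differ, and m_nVC1SeqHeaderSize stands in for the actual length variable), the change amounts to:

// In CVC1DecVideoFilter::Receive (vc1_dec_filter.cpp), skip the header injection:
// memcpy(pDataBuffer, m_pVC1SeqHeader, m_nVC1SeqHeaderSize);  // comment this out
// pDataBuffer += m_nVC1SeqHeaderSize;                         // and this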

I hope this helps. Please let us know if you have any more questions.

Regards,

Jeff

Thanks for your reply!

DecodeHeader: In DirectShow the sequence headers are stored as part of the media type information structure. A DirectShow decoder filter has to decide whether to accept or decline a pin connection request, based only on the media type information. The DirectShow filter doesn't have access to the full video bitstream at this point in time. Which means that the DS filter has no way to supply more data when DecodeHeader returns MFX_ERR_MORE_DATA during pin connection negotiation. If you don't provide a way for DecodeHeader to work with just the sequence headers (as they are stored in the media type information) then every DirectShow filter using the Media SDK will have to either hack around the problem by adding an AUD (and this won't work for MPEG2 and VC-1), or implement its own video bitstream parsing code.
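To illustrate the situation (a sketch only, assuming the splitter connects with FORMAT_MPEG2Video and the parameter sets in dwSequenceHeader; the session member and error handling are elided):

HRESULT CMyDecoderFilter::CheckMediaType(const AM_MEDIA_TYPE* pmt)  // hypothetical filter
{
    if (pmt->formattype != FORMAT_MPEG2Video)
        return VFW_E_TYPE_NOT_ACCEPTED;

    MPEG2VIDEOINFO* vih = (MPEG2VIDEOINFO*)pmt->pbFormat;

    mfxBitstream bs = {};
    bs.Data       = (mfxU8*)vih->dwSequenceHeader;
    bs.DataLength = vih->cbSequenceHeader;   // SPS/PPS only - no more data exists yet
    bs.MaxLength  = vih->cbSequenceHeader;

    mfxVideoParam par = {};
    par.mfx.CodecId = MFX_CODEC_AVC;

    // MFX_ERR_MORE_DATA here can never be satisfied: the stream hasn't started.
    mfxStatus sts = MFXVideoDECODE_DecodeHeader(m_session, &bs, &par);
    return (sts == MFX_ERR_NONE) ? S_OK : VFW_E_TYPE_NOT_ACCEPTED;
}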

VC-1 decoding/seeking: There seems to be a misunderstanding here. I've reported two different problems:

(1) With some DirectShow splitters, injecting VC-1 headers results in heavy image corruption. This is IMHO a fault of the DirectShow splitters.
(2) After a seek, the Media SDK's VC-1 decoder sometimes produces block artifacts in the first decoded frame.

The bug I'm concerned with is (2) and not (1). If you re-read my earlier comments, I've already explained how to work around problem (1). You're right that commenting out "m_pVC1SeqHeader" fixes (1). But so does remuxing the m2ts file to MKV by using eac3to, as I suggested earlier.

Even after the "m_pVC1SeqHeader" fix, problem (2) still occurs. It is less obvious, though, and doesn't occur with every seek. Problem (2) has nothing to do with your DirectShow VC-1 decoder sample or with the way VC-1 headers are injected. It is a problem with the Media SDK's VC-1 decoder itself.

Thanks for your clarification about where the current behavior of DecodeHeader is problematic. We will look deeper into the pin negotiation scenario you've described.

Yes, there are multiple issues with VC-1 decode seeking. We are currently discussing ways to make the filters more general as well as ways to improve our stream reposition testing of the decoders themselves.

Since the DirectShow filters are samples we only expose them to limited testing. In the case of VC-1 our development has only focused on one pipeline:

VC-1 in WMV container -> (WM ASF Reader) -> (MSDK VC-1 Decoder) -> (Enhanced Video Renderer)

The filter samples are not intended to be production ready general purpose solutions, though we are interested in making them better.

Leaving aside the seek issues in the VC-1 decoder itself (which are being investigated separately), are these adequate summaries of the problems you've described so far?

1) You need a way for DecodeHeader to work with meaningful error codes in an environment such as pin negotiation, where the rest of the stream is not available yet.

2) The header injection model currently used by the MSDK VC-1 decode filter (likely an adaptation to the WM ASF Reader) does not work well with other sources.

3) There may be some additional parsing issues, where we require a bitstream element order not required by the spec. Just to clarify -- have you seen this only with VC-1 or with MPEG2 and H.264 as well?

Is there anything else?

We will probably have more questions as we investigate further. Thanks for your help with identifying these issues.

Jeff

> are these adequate summaries of the problems
> you've described so far?

1) and 2): yes.

Personally, I can work around the issues 1) and 2), though. For me the most crucial problem is 3), because I don't know how to work around it.

> 3) There may be some additional parsing issues,
> where we require a bitstream element order not
> required by the spec.

I think 3) is an additional issue, but I'm not sure what the cause is. It's not really a seeking issue, I think. I've tried to completely destroy and recreate all MSDK stuff with every seek and the problem still occurs. So I think the real issue is that sometimes the very first decoded frame shows blocks. Not sure why.

> Just to clarify -- have you seen this only with
> VC-1 or with MPEG2 and H.264 as well?

I've not seen it with MPEG2. I have seen a similar issue with H.264, though. It could actually be exactly the same issue. It's also always the first frame which shows blocks; the very next frame is always perfectly clean.

> Is there anything else?

There might be an MPEG2 problem, totally separate from the issues mentioned so far, but I've not yet analyzed it in more detail. Basically at the start of one MPEG2 sample I'm getting visible artifacts with the Intel software decoder, but not with libav/ffmpeg. But let me double check this before "officially" reporting it as a bug.

> Thanks for your help with identifying these issues.

Thanks for your support - I appreciate it!!

Best regards, Mathias.

Ok, I've double checked the MPEG2 issue. Here's a sample and two screenshots:

sample: http://madshi.net/intelMpeg2.mkv
screenshot first frame Intel decoder: http://madshi.net/intel.jpg
screenshot first frame libav decoder: http://madshi.net/libav.jpg

I have to say, though, that the Microsoft MPEG2 decoder shows the same artifacts as yours does. So it's possible that there's some kind of problem with the MPEG2 bitstream itself. However, libav/ffmpeg handles this very nicely, with no visible artifacts. I'm not sure if this is worth investigating on your side. I'll leave that up to you.

Sorry for triple posting, but I think I found a good way for you to reproduce the VC-1 problem I'm most concerned with. I've uploaded a new sample for you here:

http://madshi.net/intelvc1.mkv

This MKV seems to start with a key frame. When using the MSDK software decoder, the first 5 decoded frames are these:

frame 1: http://madshi.net/intel1.jpg
frame 2: http://madshi.net/intel2.jpg
frame 3: http://madshi.net/intel3.jpg
frame 4: http://madshi.net/intel4.jpg
frame 5: http://madshi.net/intel5.jpg

As you can see, the first two frames come out garbled. Now the interesting bit: If I play back this video with the Microsoft VC-1 decoder, the first 3 frames are these:

frame 1: http://madshi.net/ms1.jpg == intel3.jpg
frame 2: http://madshi.net/ms2.jpg == intel4.jpg
frame 3: http://madshi.net/ms3.jpg == intel5.jpg

Hope this makes it easier for you to pinpoint the cause of the problem.

Hi Mathias,
Thanks for such accurate bug reports and help with reproduction. Jeff and I will look into this.
Regards, Nina

I have some new information. I've been working on integrating the libav/ffmpeg decoders into my DirectShow renderer, and I've learned a few things that might be helpful for you, too:

(1) In DirectShow, splitters often deliver a couple of "preroll" samples with a negative timestamp before the real seek point sample is delivered with a positive timestamp. The preroll samples are meant to be decoded (e.g. in order to initialize the decoder properly), but they're usually not displayed. Your DirectShow filter samples in the MSDK have this code in frame_constructors.cpp:

rtStart = (rtStart < 0) ? 0 : rtStart;

Basically samples with a negative timestamp are converted to a 0 timestamp. That's a bad idea, because it practically converts all preroll samples to real samples. So the preroll samples will probably be displayed by the video renderer, even though they shouldn't be. I would suggest simply removing the code quoted above. Works fine for me.
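In other words, something roughly like this instead (a sketch; DeliverDecodedFrame is just a placeholder for the normal delivery path):

REFERENCE_TIME rtStart = 0, rtStop = 0;
pSample->GetTime(&rtStart, &rtStop);
// removed: rtStart = (rtStart < 0) ? 0 : rtStart;  (this promoted preroll to real frames)

// ... feed the sample to the decoder as usual, so its state still gets initialized ...

if (rtStart < 0)
    return S_OK;               // preroll: decoded for reference only, not displayed
DeliverDecodedFrame(rtStart);  // placeholder for the normal delivery path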

(2) The libav decoder also sometimes outputs a first corrupted frame, followed by proper frames, similar to what the MSDK decoder does. The first corrupted frame might even have a positive timestamp. However, the libav decoder marks frames as being key frames or not. I've found that if the first decoded frame is corrupted, it's usually not a key frame. So I've simply added code to my renderer to start displaying frames only with the first decoded key frame which has a positive timestamp. This seems to work very well. Unfortunately I don't see any way with the MSDK to implement a similar solution because I don't see how I could know whether a decoded frame is a key frame or not. So my suggestion is this: At start of decoding and after a reset/seek, either let the MSDK silently drop all decoded frames until the first key frame with a positive timestamp is decoded, or alternatively offer a way for MSDK users to find out whether a decoded frame is a key frame or not. E.g. you could add a "bool key_frame" to the mfxFrameInfo structure. Doing either of these might already fix the VC-1 problem "3)" we talked about earlier. Not sure, though. The VC-1 problem might still be something different.
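To illustrate the second option (a sketch only - mfxFrameInfo has no such field today, so "KeyFrame" and the frame wrapper below are hypothetical):

bool m_seenKeyFrame = false;   // reset at start of decoding and on every seek/flush

bool ShouldDisplay(const DecodedFrame& f)   // hypothetical decoded-frame wrapper
{
    if (!m_seenKeyFrame)
    {
        if (f.KeyFrame && f.TimeStamp >= 0)  // first clean entry point
            m_seenKeyFrame = true;
        else
            return false;                    // silently drop leading garbage
    }
    return true;
}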

Hi Mathias,
Thanks a lot for sharing these additional results! I will check your suggestions and follow up.
Nina

Hi Mathias,
Sorry for the delayed answer; it seems we missed your first question behind the reposition discussion - the one about DecodeHeader. I have a suggestion for you: if you are sure that you are providing a full frame or full header in the input bitstream, you may set the flag DataFlag = MFX_BITSTREAM_COMPLETE_FRAME. Then the decoder will not ask for more data. Otherwise it needs the next start code to understand that the header is finished.
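For example (a sketch; session setup and the actual header bytes are elided):

mfxBitstream bs = {};
bs.Data       = headerBytes;   // e.g. the SPS/PPS from the media type
bs.DataLength = headerSize;
bs.MaxLength  = headerSize;
bs.DataFlag   = MFX_BITSTREAM_COMPLETE_FRAME;   // "this buffer is complete"

mfxVideoParam par = {};
par.mfx.CodecId = MFX_CODEC_AVC;
mfxStatus sts = MFXVideoDECODE_DecodeHeader(session, &bs, &par);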
We will also do our best to include some fixes for the repositioning into our next release.
Thank you, Nina

Thanks Nina,

"DataFlag = MFX_BITSTREAM_COMPLETE_FRAME" sounds like a good solution to me!

Looking forward to the fixes you're mentioning. I'm not in a hurry, though, so no problem if it takes some time.

FWIW, I have one more piece of information to share: While implementing the libav/ffmpeg MPEG2 decoder, I've found that some transport streams have pretty bad (swapped) timestamps, resulting in both libav/ffmpeg and MSDK decoders outputting swapped timestamps, too, resulting in non-smooth playback. If you are interested in this problem, I can provide you with a sample.

Looking through existing open source MPEG2 decoder implementations, it seems that the usual way to work around this issue is to use only the I-frame timestamps and to ignore (= interpolate) all other timestamps. That's the approach used by ffdshow, at least. For that to work it might be useful for the MSDK user to be able to find out which decoded frame is an I-frame and which is not. Or alternatively, you could probably also do the timestamp dropping & interpolation inside of MSDK, if you prefer that. Or maybe MSDK users could already solve this right now by feeding only I-frame timestamps to MSDK and feeding e.g. "-1" for P-frame and B-frame timestamps to MSDK? Would the MSDK then already do the interpolation for the missing timestamps? Haven't tried that yet...
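For illustration, the ffdshow-style fix-up boils down to something like this (a sketch, assuming the decoder reported the frame type and the frame rate is known; "next" would also need to be reset on every seek):

REFERENCE_TIME FixTimestamp(bool isIFrame, REFERENCE_TIME reported)
{
    static REFERENCE_TIME next = 0;
    const REFERENCE_TIME frameDuration = 10000000 / 25;  // example: 25 fps

    if (isIFrame)
        next = reported;       // trust and re-anchor on I-frame timestamps
    REFERENCE_TIME pts = next;
    next += frameDuration;     // interpolate P-/B-frame timestamps
    return pts;
}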

Best regards, Mathias.

Why should the MSDK mess with the time stamps? Normally, a decoder core does not need to care about external timestamps, it just decodes the frames. You can decode an H.264 elementary stream with the MSDK and there is obviously no timestamp info from a container header.
I think the MSDK just passes the time stamps you set.

@Markus, there's a difference between decoding order and presentation order. Video frames are usually not decoded in presentation order, but in decoding order. You feed MSDK (or libav/ffmpeg) with the bitstream in decoding order, but with presentation timestamps. Because decoding and presentation order differs, the presentation timestamps you feed MSDK/libav/ffmpeg with are not continuous. However, you need to feed the video renderer with frames in presentation order and with continuous presentation timestamps. The frame reordering is done by MSDK/libav/ffmpeg, including reordering of timestamps. And this already works just fine in most cases, just not with some MPEG2 transport streams, which sometimes have screwed up container timestamps.

I know the difference between decoding and presentation order, but that order is not defined by the container timestamp, but by the GOP structure. The MSDK decoder, like most decoders, delivers the frames in presentation order as derived from the elementary stream.

If you don't believe it, just set the timestamp you pass to DecodeFrameAsync in mfxBitstream to 0 (or any other value) for all frames. And by some miracle from Intel, they come out in presentation order and mfxFrameSurface1.Data.TimeStamp has the same value for all frames.

So again, I would rather not have the MSDK mess with the TimeStamp.

Personally, I don't care much if the MSDK "messes" with the timestamps or not. All I care about is that there must be *some* solution for me to achieve smooth playback, even with broken transport streams. FWIW, in order to get continuous timestamps from the MSDK, I need to feed the MSDK with presentation timestamps for MPEG2 and H.264, but with decoding timestamps for VC-1. If I feed the MSDK VC-1 decoder with presentation timestamps, I get swapped timestamps back from MSDK output. This behaviour is compatible with the Microsoft VC-1 decoder, though, so it's probably "good" this way. Anyway, it seems that the MSDK already does some timestamp processing, or "messing", as you call it. But as I said, all I care about is that there should be some way for me to achieve smooth playback. With libav/ffmpeg that's possible because libav/ffmpeg reports which decoded frame is an I-frame, so I can post-process the timestamps accordingly. This doesn't seem to be possible with the MSDK at the moment.

If the container timestamps are wrong, but the stream does not have gaps, the safest way is to calculate the presentation timestamps yourself by counting the decoded frames and calculating the render time from the frame number and the frame rate. You have to reset the frame count and possibly a start time offset on the appropriate DirectShow calls.
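In code that's simply (a sketch):

// 10,000,000 reference-time units per second in DirectShow
REFERENCE_TIME RenderTime(LONGLONG frameNumber, double fps,
                          REFERENCE_TIME segmentStart)
{
    return segmentStart + (REFERENCE_TIME)(frameNumber * 10000000.0 / fps);
}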

Unfortunately it's not as easy as that. After a seek, DirectShow splitters provide preroll frames with negative timestamps, and since MSDK doesn't tell me which frame is an I-frame and which is not, I don't really know at which decoded frame exactly to start with presentation timestamp 0. If I simply guess, audio/video sync could be off by more than 1 frame. One proper solution to this problem would be to trust the I-frame timestamps and interpolate the timestamps for B- and P-frames, which is what the well-known open source DirectShow decoder filter "ffdshow" does. But that's only possible if I know which decoded frame is an I-frame and which is not. libav/ffmpeg tells me that; the MSDK does not.

For seeking, the DirectShow splitter needs the container time stamp. Does seeking work correctly on a stream with bogus time stamps? If it does, the first frame to send to the renderer is the one with a timestamp greater than or equal to 0. It does not need to be an I-frame.

Markus, I appreciate that you're trying to help. But you seem to misunderstand the purpose of my thread. I'm not looking for advice/help here. I know the problem, I know what possible solutions there are. Everything you told me so far I already knew before. The purpose of this thread is to give feedback to Intel, so that they can improve the MSDK.

In case you want to know: Your latest suggestion would work "ok", but not really well. If the timestamps are borked, then the first timestamp >= 0 could already be a bogus one. If you sync presentation start to the first decoded frame with a >= 0 timestamp, regardless of whether it's an I-, P- or B-frame, audio/video sync will be all over the place after every seek. If you sync presentation only to I-frames, there might still be a certain audio/video desync, but at least it should be constant, during playback and after every seek. Which is crucial to the end user.

I understand the purpose of the thread. I just wanted to make clear that, maybe except for VC-1, there is no decoder bug and the timestamp handling of MSDK is correct. All fixes/workarounds for broken containers are a matter of the shim around it. We are already using MSDK 2 in production for decoding ts and mov containers and seeking works even for damaged files (like ts from interrupted DVB).

The MSDK timestamp handling works as intended, as long as the container timestamps are correct. All I'm saying is that in certain situations there's a need to work around badly muxed video files and the MSDK doesn't make that as easy as libav/ffmpeg does. I think it's not a bad idea for the MSDK to learn from libav/ffmpeg; after all, libav/ffmpeg is the de facto standard decoding library, used by the broad majority of media-related software. What makes timestamp fixing so much easier with libav/ffmpeg is that libav/ffmpeg reports which decoded frame is which type (I, P, B, ...), which the MSDK does not seem to do yet. So my main suggestion was for the MSDK to also report which decoded frame is which type.

BTW, the TS files with bad timestamps are not "damaged". They are just muxed with bad timestamps. Playing damaged/corrupted files is a completely different thing compared to playing files with consistently bad container timestamps.

The MSDK decode/encode stages generally just pass timestamps along, as mentioned previously. VPP frame rate conversion might alter some timestamps, but there are currently no plans to support any other form of interpolation or cleanup.

The idea behind the design, which we've verified with the MSDK architect, is that timestamp processing should be at the container level.

In terms of the DirectShow filters this puts a lot of responsibility on the splitter. We are currently discussing plans to improve the sample splitters for cases like those mentioned in this thread. This may also include distributing the source code so that things like custom timestamp cleanup algorithms can be added. Unfortunately we can't make any guarantees about the contents of future releases, or when these changes might be available, but this does give us a lot of good things to consider.

The forum is definitely a great way to provide feedback. Conversations like this are very helpful as we work toward a better product and better documentation.

Regards,

Jeff

@Jeff, thanks for your reply.

I think it's a good concept to have the MSDK just pass timestamps along. I also agree that in an ideal world the splitter would take care of fixing broken timestamps. Unfortunately I'm not aware of a single existing DirectShow splitter filter which fixes broken timestamps. In real life fixing broken timestamps is either not done at all, or it's done by the decoder filter. E.g. the most important open source video decoders "ffdshow", "DScaler" and "LAV Video Decoder" are all doing timestamp fixing internally, but none of the splitters (neither open source nor commercial) do that, as far as I can tell.

Practically that means that a DirectShow decoder filter (regardless of whether it's using the MSDK or not) must be able to internally fix the timestamps, if it wants to play problematic video files smoothly with the existing DirectShow splitter filters. That doesn't mean that the MSDK has to do any sort of timestamp fixing itself. It just needs to provide the developer with enough information to perform the necessary fixes. I think reporting which decoded frame is which type (I, P, B) would already do the job.

You've provided some excellent reasons to consider providing frame type as an output from the decoders. We're exploring options for including this feature in a future release. This information could potentially be useful in a lot of cases beyond fixing timestamps.

Unfortunately we can't make any guarantees, but we'll do our best to make sure your use case is represented.

In the meantime, how significantly is the lack of frame type information affecting your projects? If this needs more attention in the short term we'll be glad to help with finding other workarounds.

Thanks for bringing this up.

Jeff

Thanks Jeff, for the excellent support.

In the short term I can live with the way things are.

Sorry to bump this old thread, but I seem to have run into a similar problem with 1080i VC-1 content. Using the sample_decode.exe application (SW decode on Win7 x64) from "MSDK-2012_pu_3.0.015_R2", I get strange block artifacts at the start of playback (see attached screenshot). This is a sample from the popular HQV Benchmark Blu-Ray so I assume there are no encoding errors. It also shows up correctly with other decoders I have tried. Here is a short clip for you to test: http://www.filefactory.com/file/151ea2u29gkb/n/FilmRes_vc1 (select slow download at the bottom of the page).
-Mark

Hi Mark,

We were able to reproduce the issue but as you point out the artifacts are only visible during SW decode. The content decodes fine using recent graphics drivers.

Thanks for reporting the issue, we will work on fixing this SW decode bug as soon as possible.

Regards, Petter
