I've got an application that actively grabs frames from a video camera, and I want to encode those frames. If my system is under computational load, the frame rate drops (since I'm actively grabbing). Now I want to cope with this variable frame rate (VFR) during encoding.
Since I'm recording a live video stream, I process each frame individually right after grabbing it. I've got a working encoding pipeline, which I've extended with VPP to handle VFR. I set a timestamp on each input frame to VPP, but the encoded stream does not seem to be affected. Here are my VPP params (they are the same for encoding):
```cpp
mfxVPPParams.vpp.In.FourCC        = MFX_FOURCC_NV12;
mfxVPPParams.vpp.In.ChromaFormat  = MFX_CHROMAFORMAT_YUV420;
mfxVPPParams.vpp.In.PicStruct     = MFX_PICSTRUCT_PROGRESSIVE;
mfxVPPParams.vpp.In.CropX         = 0;
mfxVPPParams.vpp.In.CropY         = 0;
mfxVPPParams.vpp.In.CropW         = inputWidth;
mfxVPPParams.vpp.In.CropH         = inputHeight;
// Width must be 16-aligned; Height alignment depends on the picture structure.
mfxVPPParams.vpp.In.Width  = MSDK_ALIGN16(inputWidth);
mfxVPPParams.vpp.In.Height = (MFX_PICSTRUCT_PROGRESSIVE == mfxVPPParams.vpp.In.PicStruct)
    ? MSDK_ALIGN16(inputHeight)
    : MSDK_ALIGN32(inputHeight);
mfxVPPParams.vpp.In.FrameRateExtN = framerate;
mfxVPPParams.vpp.In.FrameRateExtD = 1;

// The output side mirrors the input.
mfxVPPParams.vpp.Out = mfxVPPParams.vpp.In;

mfxVPPParams.IOPattern = MFX_IOPATTERN_IN_SYSTEM_MEMORY | MFX_IOPATTERN_OUT_SYSTEM_MEMORY;
```
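For reference, this is roughly how I derive the value I write into the surface's `Data.TimeStamp` before calling VPP. Media SDK timestamps are in 90 kHz units (`MFX_TIMESTAMP_FREQUENCY` in `mfxdefs.h`); the `FrameData` struct below is just a stand-in for the real `mfxFrameData` so the sketch compiles on its own:

```cpp
#include <cstdint>

// Stand-in for mfxFrameData; in the real pipeline I set
// vppSurface->Data.TimeStamp (an mfxU64) instead.
struct FrameData {
    uint64_t TimeStamp = 0;
};

// Media SDK timestamps are expressed in 90 kHz units
// (MFX_TIMESTAMP_FREQUENCY == 90000 in mfxdefs.h).
constexpr uint64_t kTimestampFreq = 90000;

// Convert the grabber's capture time (milliseconds since stream start)
// into a 90 kHz tick count.
uint64_t ToMfxTimestamp(uint64_t captureTimeMs) {
    return captureTimeMs * kTimestampFreq / 1000;  // 1 ms == 90 ticks
}
```

So a frame grabbed 40 ms into the stream gets a timestamp of 3600, regardless of how irregularly the frames arrive.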
I'm using the number of surfaces suggested for encoding. For VPP the suggested number is 5 (both in and out). However, for VPP I'm only using one, since I process all frames one by one (I can see why encoding needs more surfaces, but for VPP in this setting I don't see the need). I fetch a free encoding surface and call VPP as follows:
```cpp
mfxVideoVPP->RunFrameVPPAsync(vppSurface, pEncSurfaces[nEncSurfIdx], NULL, &syncp);
```
My questions are:
- Is using VPP the way to go for handling VFR? What are my options?
- I've come across information stating that timestamping for VFR is something the container handles. Is this true? Could it be that my container (.mp4 from MP4Box) disregards the timestamps?
- Should I use 5 surfaces for VPP after all? If so, how do I connect VPP to encoding? (Are the VPP out surfaces the same as the encoding surfaces? Should the number of encoding surfaces therefore be the maximum of the suggested VPP and suggested encoding surface counts?)
- Are there possibly some other mistakes in my reasoning or in my code?
I'd be very happy about any hints you can provide! I'm aware of section 4.9.4 of the dev guide as well as the MFXVideoVPPEx class, but I can't seem to find the answers to my questions there.
Thanks a lot in advance!