Scaling Operations in Intel Media SDK

This article describes the scaling operations available in the Media SDK, which is a component of Intel® Media Server Studio and Intel® INDE. Scaling is one of the most commonly used video processing operations. The application can specify a region of interest for each video using the Video Processing Pipeline (VPP). The Media SDK VPP supports multiple scaling operations; here we describe the two most often used, along with their results.

  1. Cropping
  2. Re-sizing

The basic pipeline of functions used to achieve scaling is described below.

Find free frame surface - Some surfaces in the surface pool may currently be in use, and a locked surface cannot be used, so a search is done to find an unused surface.

int GetFreeSurfaceIndex(mfxFrameSurface1** pSurfacesPool, mfxU16 nPoolSize)
{
    if (pSurfacesPool)
        for (mfxU16 i = 0; i < nPoolSize; i++)
            if (0 == pSurfacesPool[i]->Data.Locked)
                return i;
    return MFX_ERR_NOT_FOUND;
}

Load raw frame into surface - The raw frame is read from the source file, first the luminance plane and then the chrominance planes, and loaded into the surface.

mfxStatus LoadRawFrame(mfxFrameSurface1* pSurface, FILE* fSource)
{
    mfxStatus sts = MFX_ERR_NONE;
    mfxU32 nBytesRead;
    mfxU16 w, h, i, pitch;
    mfxU8* ptr;
    mfxFrameInfo* pInfo = &pSurface->Info;
    mfxFrameData* pData = &pSurface->Data;

    w = pInfo->Width;
    h = pInfo->Height;
    pitch = pData->Pitch;
    ptr = pData->Y;

    // read luminance plane, one row at a time
    for (i = 0; i < h; i++) {
        nBytesRead = (mfxU32) fread(ptr + i * pitch, 1, w, fSource);
        if (w != nBytesRead)
            return MFX_ERR_MORE_DATA;
    }

    mfxU8 buf[2048];        // maximum supported chroma width for nv12
    w /= 2;
    h /= 2;
    ptr = pData->UV;
    if (w > 2048)           // a chroma row must fit into buf
        return MFX_ERR_UNSUPPORTED;

    // load U
    sts = ReadPlaneData(w, h, buf, ptr, pitch, 0, fSource);
    if (MFX_ERR_NONE != sts)
        return sts;
    // load V
    sts = ReadPlaneData(w, h, buf, ptr, pitch, 1, fSource);
    if (MFX_ERR_NONE != sts)
        return sts;

    return MFX_ERR_NONE;
}
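ReadPlaneData is called above but not listed in this article. Judging from its last-but-one argument (0 for U, 1 for V), it copies one planar chroma plane into the interleaved UV plane of the surface. The sketch below shows that interleaving step on in-memory buffers only. InterleaveChromaPlane is a hypothetical name, it uses standard integer types instead of the SDK types, and it is not the tutorial's actual helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the Media SDK): copies one planar chroma
// plane of w x h samples into an interleaved NV12-style UV plane.
// offset 0 writes the U samples, offset 1 writes the V samples.
// pitch is the length of one UV row in bytes.
void InterleaveChromaPlane(const std::uint8_t* src, int w, int h,
                           std::uint8_t* uv, int pitch, int offset)
{
    assert(pitch >= 2 * w);               // one UV row holds w (U, V) pairs
    assert(offset == 0 || offset == 1);
    for (int i = 0; i < h; ++i) {
        for (int j = 0; j < w; ++j) {
            const std::size_t dst = static_cast<std::size_t>(i) * pitch + j * 2 + offset;
            uv[dst] = src[static_cast<std::size_t>(i) * w + j];
        }
    }
}
```

Calling it once with offset 0 for the U plane and once with offset 1 for the V plane fills the UV plane with alternating U and V bytes, which is the layout the surface expects.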

Write frame to the output surface - RunFrameVPPAsync processes the frame asynchronously, so the sync operation is called afterwards to wait until the output is complete. The output surface is then available to be written back as a raw frame.

mfxStatus WriteRawFrame(mfxFrameSurface1* pSurface, FILE* fSink)
{
    mfxFrameInfo* pInfo = &pSurface->Info;
    mfxFrameData* pData = &pSurface->Data;
    mfxU32 i, j, h, w;
    mfxStatus sts = MFX_ERR_NONE;

    // write luminance plane, one row at a time
    for (i = 0; i < pInfo->Height; i++)
        sts = WriteSection(pData->Y, 1, pInfo->Width, pInfo, pData, i, 0,
                           fSink);

    h = pInfo->Height / 2;
    w = pInfo->Width;

    // write U samples (even bytes of the interleaved UV plane)
    for (i = 0; i < h; i++)
        for (j = 0; j < w; j += 2)
            sts = WriteSection(pData->UV, 2, 1, pInfo, pData, i, j,
                               fSink);

    // write V samples (odd bytes of the interleaved UV plane)
    for (i = 0; i < h; i++)
        for (j = 1; j < w; j += 2)
            sts = WriteSection(pData->UV, 2, 1, pInfo, pData, i, j,
                               fSink);

    return sts;
}
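The two nested loops above walk the interleaved UV plane twice: once over the even bytes (U) and once over the odd bytes (V). The following sketch performs the same traversal on an in-memory buffer and collects the samples into a vector instead of writing them to a file. ExtractChroma is a hypothetical helper written for illustration and is not part of the Media SDK or the tutorial code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: extracts one chroma component from an interleaved
// NV12-style UV plane.
// w      - luma width in pixels (a UV row holds w bytes: w/2 U and w/2 V)
// h      - number of chroma rows (luma height / 2)
// pitch  - length of one UV row in bytes
// offset - 0 extracts U, 1 extracts V
std::vector<std::uint8_t> ExtractChroma(const std::uint8_t* uv, int w, int h,
                                        int pitch, int offset)
{
    assert(pitch >= w);
    assert(offset == 0 || offset == 1);
    std::vector<std::uint8_t> plane;
    plane.reserve(static_cast<std::size_t>(w / 2) * h);
    for (int i = 0; i < h; ++i)
        for (int j = offset; j < w; j += 2)   // same stride-2 walk as WriteRawFrame
            plane.push_back(uv[static_cast<std::size_t>(i) * pitch + j]);
    return plane;
}
```

Extracting with offset 0 and then offset 1 yields the U plane followed by the V plane, which matches the order in which WriteRawFrame emits them.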

We will be using sample_vpp from the tutorials, which you can find on the Media Solution Portal Page. The input is "foreman.yuv", which you can get from here. The downloaded file is in .y4m format and can be converted to YV12 format using ffmpeg.

ffmpeg -i input.y4m output.yuv

There are six VPP parameters which are used for scaling operations.

  • CropX, CropY, CropW and CropH define the region of interest within the input and the output frame. They must be set explicitly to achieve the desired result.
  • Width and Height must be defined explicitly for both input and output. Width and height should be multiples of 16 for a frame picture, and height must be a multiple of 32 for a field picture.
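The alignment rule in the last bullet can be sketched as a small rounding helper. AlignUp and AlignedHeight are hypothetical names used only for this illustration; the sketch assumes the alignment is a power of two, which holds for 16 and 32.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: rounds value up to the next multiple of alignment.
// alignment must be a non-zero power of two (16 and 32 both qualify).
inline std::uint16_t AlignUp(std::uint16_t value, std::uint16_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return static_cast<std::uint16_t>((value + alignment - 1) & ~(alignment - 1));
}

// Height alignment as stated above: 16 for a frame picture,
// 32 for a field picture.
inline std::uint16_t AlignedHeight(std::uint16_t height, bool isFieldPicture)
{
    return AlignUp(height, isFieldPicture ? 32 : 16);
}
```

For example, a 1080-line frame rounds up to 1088, while 720 is already a multiple of 16 and stays unchanged for a frame picture. In that case the visible size would still be carried by CropW and CropH.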

Cropping - This is one of the most commonly used video processing operations and is often used to define a region of interest (ROI). It can also be used to change the aspect ratio of the video. The most common changes are 16:9 -> 4:3 and 4:3 -> 16:9. When the aspect ratio changes, pillar boxing or letter boxing can be introduced. The table below lists the input options that can be used to achieve cropping, letter boxing and pillar boxing in the sample_vpp tutorial.

                      CropX  CropY  CropW  CropH  Width Height
Input                   128    128   1024    464   1280    720
Output_Crop               0      0   1024    464   1024    464
Output_PillarBoxing     128      0   1024    720   1280    720
Output_LetterBoxing       0    128   1280    464   1280    720
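When pillar or letter boxing is meant to preserve the source aspect ratio, the output crop rectangle can be computed rather than chosen by hand. The sketch below shows one common way to do that. It is not how the values in the table were derived (those rows place the ROI without preserving its aspect ratio). CropRect is a hypothetical stand-in for the crop fields of the SDK's frame info structure, and FitPreservingAspect is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the crop fields of the SDK's frame info structure.
struct CropRect {
    std::uint16_t CropX, CropY, CropW, CropH;
};

// Computes the centered region of a dstW x dstH output frame that shows a
// srcW x srcH source without distortion. Bars fill the remaining area.
CropRect FitPreservingAspect(std::uint16_t srcW, std::uint16_t srcH,
                             std::uint16_t dstW, std::uint16_t dstH)
{
    assert(srcW > 0 && srcH > 0);
    std::uint32_t w = dstW;
    std::uint32_t h = dstH;
    if (static_cast<std::uint32_t>(srcW) * dstH > static_cast<std::uint32_t>(dstW) * srcH)
        h = static_cast<std::uint32_t>(dstW) * srcH / srcW;  // source is wider: letter boxing
    else
        w = static_cast<std::uint32_t>(dstH) * srcW / srcH;  // source is narrower: pillar boxing
    w &= ~1u;  // keep sizes even, since 4:2:0 chroma is subsampled by two
    h &= ~1u;

    CropRect r;
    r.CropW = static_cast<std::uint16_t>(w);
    r.CropH = static_cast<std::uint16_t>(h);
    r.CropX = static_cast<std::uint16_t>(((dstW - w) / 2) & ~1u);
    r.CropY = static_cast<std::uint16_t>(((dstH - h) / 2) & ~1u);
    return r;
}
```

For a 640x480 (4:3) source in a 1280x720 (16:9) output this gives a 960x720 region at CropX = 160, that is, pillar boxing. For a 1280x720 source in a 640x480 output it gives a 640x360 region at CropY = 60, that is, letter boxing.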

Running sample_vpp with the parameters in the table above produces the following outputs:

Re-sizing - This is another video processing operation, used to bring the video to the desired size. Re-sizing does not alter the image content; it only changes the resolution of the output. Re-sizing can also be done in a single dimension by changing only the width or only the height of the video, which is also known as stretching in the horizontal or vertical direction. The table below lists the input options that can be used to achieve re-sizing and stretching in the sample_vpp tutorial.

                           CropX  CropY  CropW  CropH  Width Height
Input                          0      0    640    480    640    480
Output_Re-size                 0      0   1280    720   1280    720
Output_VerticalStretch         0      0    640    608    640    608
Output_HorizontalStretch       0      0    720    480    720    480
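The effect of each row in the table can be read off as a pair of scale factors, output crop size divided by input crop size. The sketch below computes them and checks whether the aspect ratio is preserved. ScaleFactors, ComputeScale and PreservesAspect are hypothetical names for this illustration and are not Media SDK functions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper type: horizontal and vertical scale factors.
struct ScaleFactors {
    double horizontal;
    double vertical;
};

// Scale factor in each direction: output crop size over input crop size.
ScaleFactors ComputeScale(std::uint16_t inCropW, std::uint16_t inCropH,
                          std::uint16_t outCropW, std::uint16_t outCropH)
{
    assert(inCropW > 0 && inCropH > 0);
    ScaleFactors s;
    s.horizontal = static_cast<double>(outCropW) / inCropW;
    s.vertical = static_cast<double>(outCropH) / inCropH;
    return s;
}

// True when both directions scale by the same ratio. Cross-multiplication
// in integers avoids comparing floating-point values.
bool PreservesAspect(std::uint16_t inCropW, std::uint16_t inCropH,
                     std::uint16_t outCropW, std::uint16_t outCropH)
{
    return static_cast<std::uint32_t>(outCropW) * inCropH ==
           static_cast<std::uint32_t>(outCropH) * inCropW;
}
```

With the table's values, Output_Re-size scales by 2.0 horizontally and 1.5 vertically, Output_VerticalStretch changes only the vertical factor, and Output_HorizontalStretch changes only the horizontal factor (1.125).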

Running sample_vpp with the parameters in the table above produces the following outputs:


You can find more detail about the parameters and the functions used in the user manual, which is included in the documentation of the MSDK installation folder or can be downloaded from here.


2 comments

Surbhi M. (Intel):

Swati, 

You can do it by not changing the size of the output and keeping it the same as the input width and height. The only things that need to change are CropW and CropH.
If you still see any issues, please send us a detailed explanation of the question on the Forum and we will try to help you as best we can.

Thanks,
-Surbhi

Swati S.:

Surbhi, I am trying to crop an image, but instead of getting an image of the original size, I get a cropped image. You also mentioned the WriteRawFrame() function. Does it need to be used to write the image again? Can't VPP do that by itself?
