Scaling Operations in Intel Media SDK

By Surbhi Madan

Published: 09/16/2014   Last Updated: 09/16/2014

This article describes the scaling operations available in the Media SDK, a component of Intel® Media Server Studio and Intel® INDE. Scaling is one of the most commonly used video processing operations. The application can specify a region of interest for each video using the Video Processing Pipeline (VPP). The Media SDK VPP supports several scaling operations; this article covers the two most frequently used ones, along with their results:

  1. Cropping
  2. Re-sizing

The basic pipeline of functions used to achieve scaling consists of the following steps:

Find free frame surface - Some surfaces in the surface pool may currently be in use, and a locked surface cannot be used, so the pool is searched for an unused surface.

int GetFreeSurfaceIndex(mfxFrameSurface1** pSurfacesPool, mfxU16 nPoolSize)
{
    if (pSurfacesPool)
        for (mfxU16 i = 0; i < nPoolSize; i++)
            if (0 == pSurfacesPool[i]->Data.Locked)
                return i;
    return MFX_ERR_NOT_FOUND;
}

Load raw frame into surface - The raw frame is read from the source file, first the luminance plane and then the chrominance planes, and loaded into the surface.

mfxStatus LoadRawFrame(mfxFrameSurface1* pSurface, FILE* fSource)
{
    mfxStatus sts = MFX_ERR_NONE;
    mfxU32 nBytesRead;
    mfxU16 w, h, i, pitch;
    mfxU8* ptr;
    mfxFrameInfo* pInfo = &pSurface->Info;
    mfxFrameData* pData = &pSurface->Data;

    w = pInfo->Width;
    h = pInfo->Height;
    pitch = pData->Pitch;
    ptr = pData->Y;

    // read luminance plane, one row at a time
    for (i = 0; i < h; i++) {
        nBytesRead = (mfxU32) fread(ptr + i * pitch, 1, w, fSource);
        if (w != nBytesRead)
            return MFX_ERR_MORE_DATA;
    }

    mfxU8 buf[2048];        // maximum supported chroma width for nv12
    w /= 2;
    h /= 2;
    ptr = pData->UV;

    // load U (ReadPlaneData is a helper from the tutorial sources)
    sts = ReadPlaneData(w, h, buf, ptr, pitch, 0, fSource);
    if (MFX_ERR_NONE != sts)
        return sts;
    // load V
    sts = ReadPlaneData(w, h, buf, ptr, pitch, 1, fSource);
    if (MFX_ERR_NONE != sts)
        return sts;

    return MFX_ERR_NONE;
}
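
The body of ReadPlaneData is not shown in this article. As a rough illustration of what such a helper has to do, the sketch below copies one planar chroma plane into the interleaved UV plane of an NV12 surface, where U and V samples alternate byte by byte. It works on memory buffers rather than a FILE*, and demo_interleave_plane and its parameters are hypothetical names chosen for this example, not part of the Media SDK or the tutorial.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Copy one planar chroma plane (w x h samples, tightly packed) into an
 * interleaved NV12 UV plane. offset selects the slot inside each U/V
 * pair: 0 for U, 1 for V. pitch is the distance in bytes between two
 * rows of the destination plane.
 */
void demo_interleave_plane(const uint8_t *src, size_t w, size_t h,
                           uint8_t *dst_uv, size_t pitch, size_t offset)
{
    assert(offset < 2);        /* only two chroma channels exist */
    assert(pitch >= 2 * w);    /* each row must hold w U/V pairs */

    for (size_t row = 0; row < h; row++)
        for (size_t col = 0; col < w; col++)
            dst_uv[row * pitch + 2 * col + offset] = src[row * w + col];
}
```

Calling the function once with offset 0 for the U plane and once with offset 1 for the V plane mirrors the two ReadPlaneData calls in LoadRawFrame.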

Write frame to the output surface - RunFrameVPPAsync processes the frame asynchronously, so SyncOperation is called afterwards to wait until the output is available. The processed surface can then be written back out as a raw frame.

mfxStatus WriteRawFrame(mfxFrameSurface1* pSurface, FILE* fSink)
{
    mfxFrameInfo* pInfo = &pSurface->Info;
    mfxFrameData* pData = &pSurface->Data;
    mfxU32 i, j, h, w;
    mfxStatus sts = MFX_ERR_NONE;

    // write luminance plane, one row at a time
    for (i = 0; i < pInfo->Height; i++)
        sts =
            WriteSection(pData->Y, 1, pInfo->Width, pInfo, pData, i, 0,
                         fSink);

    h = pInfo->Height / 2;
    w = pInfo->Width;

    // write U samples (even byte offsets of the interleaved UV plane)
    for (i = 0; i < h; i++)
        for (j = 0; j < w; j += 2)
            sts =
                WriteSection(pData->UV, 2, 1, pInfo, pData, i, j,
                             fSink);
    // write V samples (odd byte offsets of the interleaved UV plane)
    for (i = 0; i < h; i++)
        for (j = 1; j < w; j += 2)
            sts =
                WriteSection(pData->UV, 2, 1, pInfo, pData, i, j,
                             fSink);
    return sts;
}

We will use sample_vpp from the tutorials, which you can find on the Media Solution Portal page, with "foreman.yuv" as the input, which you can get from here. The downloaded file is in .y4m format and can be converted to raw planar YUV 4:2:0 (.yuv) format using ffmpeg:

ffmpeg -i input.y4m output.yuv
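
A quick way to sanity-check the converted file is to compare its size with the expected frame size. The sketch below assumes the file holds 8-bit planar 4:2:0 frames, which is the layout LoadRawFrame reads (a full-size luma plane followed by half-width, half-height U and V planes). The function names are hypothetical and used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes in one raw 8-bit 4:2:0 frame: a full-size luma plane plus two
 * chroma planes subsampled by two in each direction. */
size_t demo_yuv420_frame_bytes(size_t width, size_t height)
{
    assert(width % 2 == 0 && height % 2 == 0);
    return width * height + 2 * (width / 2) * (height / 2);
}

/* Number of whole frames contained in a raw file of file_bytes bytes. */
size_t demo_yuv420_frame_count(size_t file_bytes, size_t width, size_t height)
{
    size_t frame = demo_yuv420_frame_bytes(width, height);
    assert(frame > 0);
    return file_bytes / frame;
}
```

For a 640x480 frame this gives 460800 bytes per frame. A file size that is not a whole multiple of the frame size usually means the width or height passed to the sample does not match the file.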

Six VPP parameters are used for scaling operations:

  • CropX, CropY, CropW, and CropH define the region of interest within the input frame and the output frame. They must be set explicitly to achieve the desired result.
  • Width and Height must be defined explicitly for both input and output. Width must be a multiple of 16. Height must be a multiple of 16 for a frame picture and a multiple of 32 for a field picture.
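
The alignment rule for Width and Height can be expressed as a small rounding helper. This is a minimal sketch; demo_align16, demo_align32, and demo_surface_height are hypothetical names for this example and are not Media SDK functions.

```c
#include <assert.h>
#include <stdint.h>

/* Round value up to the next multiple of 16. */
uint16_t demo_align16(uint16_t value)
{
    return (uint16_t)((value + 15u) & ~15u);
}

/* Round value up to the next multiple of 32. */
uint16_t demo_align32(uint16_t value)
{
    return (uint16_t)((value + 31u) & ~31u);
}

/* Height to request for a surface: a multiple of 16 for frame pictures,
 * a multiple of 32 for field pictures. */
uint16_t demo_surface_height(uint16_t height, int is_field_picture)
{
    uint16_t aligned = is_field_picture ? demo_align32(height)
                                        : demo_align16(height);
    assert(aligned >= height);   /* trips if the rounding overflowed */
    return aligned;
}
```

A height of 720 is already a multiple of 16, so it is unchanged for a frame picture, but it rounds up to 736 for a field picture. The crop rectangle still describes the visible 720 lines in that case.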

Cropping - This is one of the most commonly used video processing operations and is often used to define a region of interest (ROI). It can also change the aspect ratio of the video; the most common conversions are 16:9->4:3 and 4:3->16:9. When the aspect ratio changes, pillar boxing or letter boxing can be introduced. The table below lists the input options that can be used to achieve cropping, letter boxing, and pillar boxing in the sample_vpp tutorial.

                      CropX  CropY  CropW  CropH  Width  Height
Input                   128    128   1024    464   1280     720
Output_Crop               0      0   1024    464   1024     464
Output_PillarBoxing     128      0   1024    720   1280     720
Output_LetterBoxing       0    128   1280    464   1280     720
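
In the pillar-boxing and letter-boxing rows, the picture area sits in the middle of the output frame, so CropX and CropY equal half of the unused width and height. The sketch below computes those offsets. DemoCrop and demo_center_crop are hypothetical names for this illustration and only mirror the field names used in the table.

```c
#include <assert.h>
#include <stdint.h>

/* Crop rectangle, using the same field names as the table. */
typedef struct {
    uint16_t CropX, CropY, CropW, CropH;
} DemoCrop;

/* Center a crop_w x crop_h picture area inside a width x height frame. */
DemoCrop demo_center_crop(uint16_t width, uint16_t height,
                          uint16_t crop_w, uint16_t crop_h)
{
    assert(crop_w <= width && crop_h <= height);

    DemoCrop c;
    c.CropX = (uint16_t)((width - crop_w) / 2);   /* bars left and right */
    c.CropY = (uint16_t)((height - crop_h) / 2);  /* bars top and bottom */
    c.CropW = crop_w;
    c.CropH = crop_h;
    return c;
}
```

With a 1280x720 output frame, a 1024x720 picture area yields CropX = 128 and CropY = 0 (pillar boxing), and a 1280x464 picture area yields CropX = 0 and CropY = 128 (letter boxing), matching the table.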

The input parameters above produce the corresponding cropped, pillar-boxed, and letter-boxed outputs.

Re-sizing - This is another video processing operation, used to change the video to a desired size. Re-sizing does not alter the image content; it only changes the resolution of the output. Re-sizing can also be done in a single dimension by changing only the width or only the height of the video, which is also known as stretching in the horizontal or vertical direction. The table below lists the input options that can be used to achieve re-sizing and stretching in the sample_vpp tutorial.

                           CropX  CropY  CropW  CropH  Width  Height
Input                          0      0    640    480    640     480
Output_Re-size                 0      0   1280    720   1280     720
Output_VerticalStretch         0      0    640    608    640     608
Output_HorizontalStretch       0      0    720    480    720     480
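
The amount of scaling in each direction follows from the ratio of the output crop size to the input crop size. The sketch below computes both factors for a pair of rows from the table. DemoScale and demo_scale_factors are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Horizontal and vertical scale factors implied by two crop sizes. */
typedef struct {
    double horizontal;
    double vertical;
} DemoScale;

DemoScale demo_scale_factors(uint16_t in_crop_w, uint16_t in_crop_h,
                             uint16_t out_crop_w, uint16_t out_crop_h)
{
    assert(in_crop_w > 0 && in_crop_h > 0);

    DemoScale s;
    s.horizontal = (double)out_crop_w / in_crop_w;
    s.vertical = (double)out_crop_h / in_crop_h;
    return s;
}
```

For the 640x480 input, Output_Re-size gives factors of 2.0 horizontally and 1.5 vertically, Output_VerticalStretch leaves the width at 1.0 and scales the height by about 1.27, and Output_HorizontalStretch scales the width by 1.125 and leaves the height at 1.0. Whenever the two factors differ, the picture is stretched in one direction relative to the other.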

The input parameters above produce the corresponding re-sized, vertically stretched, and horizontally stretched outputs.


You can find more detail about the parameters and functions in the user manual, which is included with the documentation in the MSDK installation folder or can be downloaded from here.
