Intel® Media Software Development Kit - An Architectural Overview


September 21, 2009 12:00 AM PDT


Introduction

The Intel® Media Software Development Kit (SDK) gives media developers a standard programming interface for creating video solutions. Its hardware-optimized decode, encode, and video preprocessing routines help developers get the most video performance out of a variety of platforms. Applications written against the Intel® Media SDK today can also benefit from future enhancements to Intel platforms.

This document gives an architectural overview of the new API. For further information on the topics discussed below, please refer to the Intel® Media SDK Reference Manual.

The Intel® Media SDK is packaged with many simple application samples that illustrate how to use the SDK to encode, decode and pre-process video.

The Intel® Media SDK is available for download here: www.intel.com/software/mediasdk.

Intel® Media SDK Architectural Overview

Figure 1 shows the high-level architecture of the Intel® Media SDK.

Figure 1: Intel® Media SDK Architectural Overview

The Intel® Media SDK programming interface is exposed to applications via the Media Library Dispatcher layer. This static library exposes the entry points for the encoding, decoding, and pre-processing functions that the Intel® Media SDK provides. This layer also decides whether the CPU or the GPU is best suited to serve the application's request, as follows:

  1. The dispatcher identifies the active graphics device and driver.
  2. It then determines the most suitable implementation of the SDK to which it will redirect the function calls.
  3. If there is no suitable platform-specific implementation, the dispatcher will redirect SDK function calls to a software implementation that comes with the SDK.

The platform-specific and software libraries export the same function entries. This lets the dispatching layer select the appropriate implementation with very little overhead.
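The selection logic described above can be sketched in a few lines of C++. This is only an illustrative model, not the dispatcher's actual code: the `Implementation` struct, the capability-check functions, and the device string `"intel-integrated"` are all hypothetical names introduced for this example.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a library that exports the common entry points.
struct Implementation {
    std::string name;
    bool usesHardware;
    // Returns true if this implementation can run on the detected device.
    bool (*supports)(const std::string& graphicsDevice);
};

// Illustrative capability checks; real device detection is far more involved.
inline bool supportsIntelGraphics(const std::string& device) {
    return device == "intel-integrated";
}
inline bool supportsAnything(const std::string&) { return true; }

// Pick the first hardware implementation that supports the device;
// otherwise fall back to the first software implementation.
inline const Implementation& selectImplementation(
        const std::vector<Implementation>& candidates,
        const std::string& graphicsDevice) {
    const Implementation* fallback = nullptr;
    for (const Implementation& impl : candidates) {
        if (impl.usesHardware && impl.supports(graphicsDevice)) {
            return impl;
        }
        if (!impl.usesHardware && fallback == nullptr) {
            fallback = &impl;
        }
    }
    assert(fallback != nullptr && "a software implementation must be present");
    return *fallback;
}
```

Because every candidate exposes the same interface, the caller does not need to know which one was chosen; in this sketch, an unrecognized device simply resolves to the software entry.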

The CPU library is used if the application has requested a task that cannot be performed on the graphics hardware. The CPU library is highly optimized using Intel® Streaming SIMD Extensions 4 (SSE4) to ensure smooth playback or quick encoding. The CPU library is often referred to as the “software fallback”.

The optimized media library for Intel® Integrated Graphics (IIG) and future Intel Discrete Graphics provides hardware decode acceleration via the Microsoft DirectX Video Acceleration Version 2 interface. The library abstracts the platform-specific requirements, which greatly reduces the time a developer must invest to accelerate content. In addition, the Intel® Media SDK provides a standardized interface for encoding video, which is not found in the DirectX Video Acceleration API.

The Intel Media SDK libraries and dispatcher layers work together to intelligently determine the most efficient method to perform the requested tasks.  Developers can be assured they are getting the best playback experience on the platform by using the Intel Media SDK.



Software Architecture Overview

Figure 2 shows the general categories of Intel® Media SDK functions:  ENCODE, DECODE, and video pre-processing (VPP).   These functions work with coded bit streams and raw video frames.

The ENCODE class of functions compresses raw video frames into a coded bitstream. The DECODE class of functions decompresses the bitstream into raw video frames, and the VPP class of functions works on raw video frames for pre-processing prior to encoding:

Figure 2: Intel® Media SDK Software Classes


Intel Media SDK Video Decode

The DECODE class of functions accepts compressed video bitstreams as input and converts them into display-ready frames as output. This class of functions processes only raw bitstreams and does not operate on bitstreams that reside in a container format such as MP4 or MPG. The application must de-multiplex the bitstream before submitting it for decoding.

The DECODE class of functions has the following entry points:

  DecodeHeader(): Parses the bitstream to retrieve the initial setup information for subsequent frames.
  Reset(): Resumes decoding after repositioning.
  DecodeFrameAsync(): Decodes a frame.
  GetPayload(): Retrieves the user data.
  GetDecodeStat(): Retrieves the decoding statistics.


The decoding process requires that the application provide the library with a sequence header (a sequence parameter set for H.264 data, or a sequence header in MPEG-2 and VC-1) that contains the video configuration parameters needed to decode subsequent frames. The DecodeHeader() function parses the bitstream's sequence parameters and must be called before decoding the first frame. In addition, the DECODE process supports repositioning the bitstream at any time during decoding. DecodeHeader() should be called again after the position change.

Applications can retrieve output video frames in two different ways:  in order of their decoding or their display.

  1. If the application retrieves frames in decoded order, DECODE returns a frame immediately upon decoding.  The application must then convert frames from their decoded order to their display order.
  2. If the application retrieves frames in display order, DECODE caches decoded frames internally until the next frame in display order is available.
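To make the difference between the two orders concrete, the following sketch shows one way an application could convert decoded order into display order. It is a simplified model, not SDK code: the `Frame` struct, its `displayIndex` field, and the `DisplayReorderer` class are hypothetical, and it assumes display indices are consecutive integers starting at 0.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical frame record: position in display order plus picture type.
struct Frame {
    int displayIndex;
    char type;  // 'I', 'P' or 'B'
};

// Buffers frames that arrive in decoded order and releases them
// once every earlier frame in display order has been released.
class DisplayReorderer {
public:
    // Accepts one frame in decoded order; returns the frames that
    // are now ready for display, in display order.
    std::vector<Frame> push(const Frame& f) {
        assert(f.displayIndex >= next_ && "frame was already displayed");
        pending_[f.displayIndex] = f;
        std::vector<Frame> ready;
        auto it = pending_.find(next_);
        while (it != pending_.end()) {
            ready.push_back(it->second);
            pending_.erase(it);
            ++next_;
            it = pending_.find(next_);
        }
        return ready;
    }

private:
    std::map<int, Frame> pending_;  // frames waiting for their turn
    int next_ = 0;                  // next display index to release
};
```

For a decoded sequence such as I0, P3, B1, B2, the P frame is held back until both B frames have arrived, which is the kind of caching the display-order mode performs on the application's behalf.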

For more information regarding the DECODE class of functions available in the Intel® Media SDK, please refer to the Intel® Media SDK Reference Manual.


Intel Media SDK Video Encode

The ENCODE class of functions takes raw video frames as input and compresses them into a bitstream.  

ENCODE processes input frames in two ways:  in order of their display or their encoding.

  1. If the display order is chosen, ENCODE receives input frames in their order of display.  A few GOP (Group of Pictures) structure parameters specify the GOP sequence during ENCODE initialization.
  2. In the encoded order mode, ENCODE receives input frames in their order of encoding.  The application must specify the exact input frame type for encoding.

ENCODE supports constant and variable bitrates. In the constant bitrate mode, ENCODE performs stuffing when the size of the least-compressed frame is smaller than what is required to meet the HRD buffer (or VBV) requirements. Stuffing is the process of appending zeros to the end of encoded frames.
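Stuffing itself is a simple operation, as the minimal sketch below suggests. The function name `stuffFrame` and the `minFrameBytes` parameter are assumptions made for illustration; a real encoder derives the required size from its HRD/VBV buffer state rather than taking it as an argument.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Pads an encoded frame with zero bytes until it reaches minFrameBytes.
// Returns the number of stuffing bytes that were appended.
// minFrameBytes is a hypothetical per-frame floor; the real value
// would come from the encoder's buffer model.
inline std::size_t stuffFrame(std::vector<std::uint8_t>& encoded,
                              std::size_t minFrameBytes) {
    if (encoded.size() >= minFrameBytes) {
        return 0;  // frame is already large enough, nothing to append
    }
    const std::size_t added = minFrameBytes - encoded.size();
    encoded.resize(minFrameBytes, 0);  // new bytes are zero-filled
    assert(encoded.size() >= minFrameBytes);
    return added;
}
```

The original payload is left untouched; only trailing zeros are added, so a frame that already meets the floor passes through unchanged.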

The ENCODE class of functions has the following entry points:

  Reset(): Fine-tunes the encoding parameters.
  GetEncodeStat(): Retrieves the encoding statistics.
  EncodeFrameAsync(): Encodes a frame with per-frame control.




Intel Media SDK Video Preprocessing (VPP)

Video preprocessing takes raw frames as input, converts them, and produces raw frames as output. The application specifies the input and output formats, and the SDK implementation configures the pipeline accordingly. The actual conversion is a chain of many small filters, as Figure 3 illustrates:

The Intel® Media SDK supports the following preprocessing functions:  color conversion, resizing, de-noising, de-interlacing, 3:2 pull down and scene detection.  The SDK dynamically configures the pipeline according to the user settings. By default, the pipeline is built in a way that best utilizes the hardware acceleration capabilities of the platform, or generates the best video quality.  The application can employ an extended buffer to configure preferences, or suggest certain operations when building the pipeline.
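The idea of a pipeline that is configured from the input and output formats can be sketched as follows. This is a toy model under several assumptions: it tracks only frame metadata (no pixel data), the `FrameInfo` fields and format strings are illustrative, and the order in which filters are chained here is arbitrary rather than the order the SDK uses.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

// Hypothetical description of a raw frame's format.
struct FrameInfo {
    int width;
    int height;
    std::string colorFormat;  // illustrative values such as "YV12", "NV12"
    bool interlaced;
};

using Filter = std::function<FrameInfo(const FrameInfo&)>;

// Adds only the filters needed to turn the input format into the output format.
inline std::vector<Filter> buildPipeline(const FrameInfo& in, const FrameInfo& out) {
    std::vector<Filter> chain;
    if (in.interlaced && !out.interlaced) {  // de-interlacing
        chain.push_back([](const FrameInfo& f) {
            FrameInfo r = f;
            r.interlaced = false;
            return r;
        });
    }
    if (in.colorFormat != out.colorFormat) {  // color conversion
        const std::string target = out.colorFormat;
        chain.push_back([target](const FrameInfo& f) {
            FrameInfo r = f;
            r.colorFormat = target;
            return r;
        });
    }
    if (in.width != out.width || in.height != out.height) {  // resizing
        const int w = out.width;
        const int h = out.height;
        chain.push_back([w, h](const FrameInfo& f) {
            FrameInfo r = f;
            r.width = w;
            r.height = h;
            return r;
        });
    }
    return chain;
}

// Runs a frame through each filter in turn.
inline FrameInfo runPipeline(const std::vector<Filter>& chain, FrameInfo frame) {
    for (const Filter& filter : chain) {
        assert(filter && "empty filter in chain");
        frame = filter(frame);
    }
    return frame;
}
```

When the input and output formats already match, the chain is empty and the frame passes through unchanged, which mirrors the point that the pipeline is built only from what the requested conversion needs.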


The VPP class of functions has the following entry points:

  Reset(): Fine-tunes the preprocessing parameters.
  GetVPPStat(): Retrieves the preprocessing statistics.
  RunFrameVPPAsync(): Preprocesses a frame.



Using the Intel® Media SDK

Asynchronous Functions

The Intel® Media SDK uses asynchronous functions to perform encode, decode, and VPP operations. Unlike ordinary “synchronous” functions, which return their results as soon as they complete, the Intel® Media SDK's asynchronous functions require an extra step, synchronization, before the application can make use of the results. The asynchronous functions have “Async” appended to their names to help distinguish them from other functions within the SDK:

  1. MFXVideoENCODE_EncodeFrameAsync for encoding
  2. MFXVideoDECODE_DecodeFrameAsync for decoding
  3. MFXVideoVPP_RunFrameVPPAsync for video preprocessing

An asynchronous function returns immediately without waiting for results. The application must explicitly “synchronize” the asynchronous function results. Without this step, the results of the asynchronous function are not available. See “Asynchronous functions and Synchronization” in the Intel® Media SDK Reference Manual for more information.
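The submit-then-synchronize calling pattern can be modeled with a small, single-threaded sketch. It is only an analogy for the control flow: the `AsyncSession` class and its methods are hypothetical and are not SDK functions, and here the queued work runs during synchronization, whereas the SDK may execute it in the background.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Toy model: submit() returns at once with a sync point, and the
// result only becomes usable after synchronize() on that sync point.
class AsyncSession {
public:
    using SyncPoint = std::size_t;

    // Queues the work and returns immediately without running it.
    SyncPoint submit(std::function<void()> work) {
        queue_.push_back(std::move(work));
        return queue_.size();  // 1-based position in the queue
    }

    // Completes every queued operation up to and including the sync point.
    void synchronize(SyncPoint sp) {
        assert(sp <= queue_.size() && "unknown sync point");
        while (completed_ < sp) {
            queue_[completed_]();
            ++completed_;
        }
    }

    // True once the operation behind the sync point has finished.
    bool isDone(SyncPoint sp) const { return sp <= completed_; }

private:
    std::vector<std::function<void()>> queue_;
    std::size_t completed_ = 0;
};
```

An application following this pattern would submit one or more frames, keep the returned sync points, and read each output only after the matching synchronization call has returned.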

Working with Microsoft* DirectX Applications

The Intel® Media SDK functions cooperate with Microsoft DirectX applications through Direct3D9* surfaces.  The following illustrates the typical scenarios:

Figure 4: Intel® Media SDK Software and Microsoft* DirectX Compatibility

The SDK uses Microsoft* DirectX Video Acceleration (DXVA2) for hardware acceleration. The DXVA2 infrastructure imposes limitations and requirements that the application must observe:

  1. Initialization requires a Direct3D* device handle.  Applications may share devices by passing an IDirect3DDeviceManager9 interface handle to the Media SDK’s MFXVideoCORE_SetHandle() function.
  2. When SDK functions create DXVA2 auxiliary devices for hardware acceleration, the SDK must allocate the list of Direct3D9 surfaces for I/O access (a surface chain) and pass that list as part of the device creation command.

The following table shows the supported Direct3D9 surface types and color formats:

Figure 5: Intel® Media SDK Supported Surface Types



Pseudo Code Example

The following illustrates the simplicity of the Intel® Media SDK.

Figure 6: Intel® Media SDK Pseudo code example

The Intel® Media SDK is designed to reduce the development time needed to leverage the video acceleration capabilities of the platform. Each application has its own requirements, but in general every application will follow an algorithm similar to the one above.



Developing Software with the Intel® Media SDK

The requirements for developing software applications with the Intel® Media SDK are as follows:

Developers must ensure the Intel® Media SDK is registered with Microsoft Visual C++ so that the include and lib folders are in the search path. Otherwise, add them manually via:

Tools->Options->Projects and Solutions->VC++ Directories

Developers are also encouraged to link their projects with the static dispatching library libmfx.lib to ensure the resulting application performs optimally on current and future Intel platforms. The library code size is approximately 250 KB for both Win32 and x64 versions. See the “Intel® Media SDK Architectural Overview” section for more information on the dispatching layer.


Conclusion

The Intel® Media SDK provides a standardized way to leverage the video processing features of current and future Intel platforms by abstracting the platform complexities. This document described some of the API’s high-level architectural features. Developers are encouraged to explore the Intel® Media SDK Reference Manual to gain an in-depth understanding of these and additional topics.