Working with hardware acceleration

To fully utilize the SDK acceleration capability, the application should support OS-specific infrastructures: Microsoft* DirectX* for Microsoft* Windows* and VA API for Linux*. The exception is the transcoding scenario, where the opaque memory type may be used. See Surface Type Neutral Transcoding for more details.

Hardware acceleration support in the application consists of video memory support and acceleration device support.

Depending on the usage model, the application can use video memory at different stages of the pipeline. Figure 5 illustrates the three major scenarios.

Figure 5 Usage of video memory for hardware acceleration

The application must use the IOPattern field of the mfxVideoParam structure to indicate the I/O access pattern during initialization. Subsequent SDK function calls must follow this access pattern. For example, if an SDK function operates on video memory surfaces at both input and output, the application must set IOPattern at initialization to MFX_IOPATTERN_IN_VIDEO_MEMORY for input combined with MFX_IOPATTERN_OUT_VIDEO_MEMORY for output. This I/O access pattern must not change inside the Init … Close sequence.
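
As an illustration of the access pattern rule above, the following minimal sketch builds the IOPattern value for a component that reads and writes video memory. To compile without the SDK headers, it declares stand-in constants and a reduced structure; VideoParamSketch and the helper functions are hypothetical names, and the numeric flag values are assumptions. A real application would include the SDK headers and fill mfxVideoParam directly.

```cpp
#include <cassert>
#include <cstdint>

// Stand-ins for SDK definitions so the sketch compiles on its own.
// Assumption: the numeric values are illustrative; real code takes the
// MFX_IOPATTERN_* flags from the SDK headers.
constexpr std::uint16_t MFX_IOPATTERN_IN_VIDEO_MEMORY   = 0x01;
constexpr std::uint16_t MFX_IOPATTERN_IN_SYSTEM_MEMORY  = 0x02;
constexpr std::uint16_t MFX_IOPATTERN_OUT_VIDEO_MEMORY  = 0x10;
constexpr std::uint16_t MFX_IOPATTERN_OUT_SYSTEM_MEMORY = 0x20;

// Hypothetical reduced form of mfxVideoParam: only the field used here.
struct VideoParamSketch {
    std::uint16_t IOPattern = 0;
};

// Builds the pattern for a component with video memory at input and output.
// The input and output flags are combined with a bitwise OR.
inline VideoParamSketch MakeVideoToVideoParams() {
    VideoParamSketch par;
    par.IOPattern = static_cast<std::uint16_t>(
        MFX_IOPATTERN_IN_VIDEO_MEMORY | MFX_IOPATTERN_OUT_VIDEO_MEMORY);
    return par;
}

// True when the input side of the pattern is video memory.
inline bool ReadsVideoMemory(const VideoParamSketch& par) {
    return (par.IOPattern & MFX_IOPATTERN_IN_VIDEO_MEMORY) != 0;
}

// True when the output side of the pattern is video memory.
inline bool WritesVideoMemory(const VideoParamSketch& par) {
    return (par.IOPattern & MFX_IOPATTERN_OUT_VIDEO_MEMORY) != 0;
}
```

The pattern is set once, before Init, and the same value then governs every call until Close.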

Initialization of any hardware-accelerated SDK component requires the acceleration device handle. The SDK component also uses this handle to query hardware capabilities. The application can share its device with the SDK by passing the device handle through the MFXVideoCORE_SetHandle function. It is recommended to share the handle before any actual usage of the SDK.

 

The SDK supports two infrastructures for hardware acceleration on Microsoft* Windows* OS: “Direct3D 9 DXVA2” and “Direct3D 11 Video API”. With the first, the application should use the IDirect3DDeviceManager9 interface as the acceleration device handle; with the second, the ID3D11Device interface. The application should share one of these interfaces with the SDK through the MFXVideoCORE_SetHandle function. If the application does not provide one, the SDK creates its own internal acceleration device. The application cannot access this internal device, so SDK input and output are limited to system memory only, which in turn reduces SDK performance. If the SDK fails to create a valid acceleration device, it cannot proceed with hardware acceleration and returns an error status to the application.

The application must create the Direct3D9* device with the flag D3DCREATE_MULTITHREADED. The flag D3DCREATE_FPU_PRESERVE is also recommended: without it, Direct3D9 switches the FPU to single precision, which influences floating-point calculations, including PTS values.

The application must also set multithreading mode for the Direct3D11* device, as Example 7 illustrates.

    ID3D11Device        *pD11Device;              /* created by the application beforehand */
    ID3D11DeviceContext *pD11Context     = NULL;
    ID3D10Multithread   *pD10Multithread = NULL;

    pD11Device->GetImmediateContext(&pD11Context);
    /* QueryInterface takes void**, so the typed pointer needs an explicit cast. */
    if (SUCCEEDED(pD11Context->QueryInterface(IID_ID3D10Multithread, (void **)&pD10Multithread)))
    {
        pD10Multithread->SetMultithreadProtected(TRUE);
        pD10Multithread->Release();
    }
    pD11Context->Release();

Example 7 Setting multithreading mode

During hardware acceleration, if a Direct3D* “device lost” event occurs, the SDK operation terminates with the return status MFX_ERR_DEVICE_LOST. If the application provided the Direct3D* device handle, the application must reset the Direct3D* device.
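
The device-lost rule above can be sketched as a small wrapper around one SDK call. This is a sketch under stated assumptions, not the SDK's own recovery procedure: the Status enumeration stands in for mfxStatus (with DeviceLost playing the role of MFX_ERR_DEVICE_LOST), and RunWithDeviceLostHandling, operation, and resetDevice are hypothetical names supplied by the application.

```cpp
#include <cassert>
#include <functional>

// Stand-in for mfxStatus so the sketch compiles without the SDK headers.
// DeviceLost plays the role of MFX_ERR_DEVICE_LOST.
enum class Status { Ok, DeviceLost, OtherError };

// Hypothetical wrapper: runs one SDK operation and, if it reports a lost
// device, resets the device only when the application owns it (that is,
// when the application shared its own device handle with the SDK).
// Returns true when the device was reset and the caller may restart work.
inline bool RunWithDeviceLostHandling(
        const std::function<Status()>& operation,
        const std::function<bool()>& resetDevice,  // application-supplied reset
        bool applicationOwnsDevice,
        Status* lastStatus) {
    assert(operation && lastStatus);
    *lastStatus = operation();
    if (*lastStatus != Status::DeviceLost) {
        return false;  // success or an unrelated error: nothing to reset
    }
    if (!applicationOwnsDevice) {
        return false;  // SDK-internal device: the application cannot reset it
    }
    return resetDevice && resetDevice();
}
```

The caller still inspects lastStatus to tell a successful call from an unrelated error; the return value only reports whether a reset took place.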

When the SDK decoder creates auxiliary devices for hardware acceleration, it must allocate the list of Direct3D* surfaces for I/O access, also known as the surface chain, and pass the surface chain as part of the device creation command. In most cases, the surface chain is the frame surface pool mentioned in the Frame Surface Locking section.

The application passes the surface chain to the SDK component Init function through an SDK external allocator callback. See the Memory Allocation and External Allocators section for details.

Only the decoder Init function requests an external surface chain from the application and uses it for auxiliary device creation. The encoder and VPP Init functions may only request internal surfaces. See the ExtMemFrameType enumerator for more details about the different memory types.

Depending on configuration parameters, the SDK requires different surface types. It is strongly recommended to call one of the MFXVideoENCODE_QueryIOSurf, MFXVideoDECODE_QueryIOSurf, or MFXVideoVPP_QueryIOSurf functions to determine the appropriate type.
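
As a rough sketch of how an application might act on the type reported by a QueryIOSurf call, the helper below maps a requested memory type to the Direct3D9 surface type named in the note under Table 6. It compiles without the SDK headers: AllocRequestSketch is a hypothetical reduced form of the allocation request, Direct3D9SurfaceTypeFor is a hypothetical helper, and the numeric flag values are assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Stand-ins for the SDK memory-type flags so the sketch compiles on its own.
// Assumption: the values are illustrative; real code takes the flags from
// the SDK headers and reads the Type field of the allocation request.
constexpr std::uint16_t MFX_MEMTYPE_VIDEO_MEMORY_DECODER_TARGET   = 0x0010;
constexpr std::uint16_t MFX_MEMTYPE_VIDEO_MEMORY_PROCESSOR_TARGET = 0x0020;

// Hypothetical reduced form of the request filled in by QueryIOSurf.
struct AllocRequestSketch {
    std::uint16_t Type;               // bit mask of memory-type flags
    std::uint16_t NumFrameSuggested;  // pool size suggested by the SDK
};

// Maps the requested type to the Direct3D9 surface type the application
// should create. Returns an empty string when no video memory is requested.
inline std::string Direct3D9SurfaceTypeFor(const AllocRequestSketch& request) {
    if (request.Type & MFX_MEMTYPE_VIDEO_MEMORY_DECODER_TARGET) {
        return "DXVA2_VideoDecoderRenderTarget";
    }
    if (request.Type & MFX_MEMTYPE_VIDEO_MEMORY_PROCESSOR_TARGET) {
        return "DXVA2_VideoProcessorRenderTarget";
    }
    return "";
}
```

Querying first and allocating second keeps the application from hard-coding a surface type that a different configuration would reject.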

Table 6 shows the supported Direct3D9 surface types and color formats; Table 7 shows those for Direct3D11. Note that NV12 is the main encoding and decoding color format. Additionally, the JPEG/MJPEG decoder supports RGB32 and YUY2 output; the JPEG/MJPEG encoder supports RGB32 and YUY2 input for Direct3D9 and Direct3D11, and YV12 input for Direct3D9 only; and VPP supports RGB32 output.

Table 6: Supported SDK Surface Types and Color Formats for Direct3D9

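SDK Class  Input Surface Type               Input Color Format             Output Surface Type              Output Color Format
---------  -------------------------------  -----------------------------  -------------------------------  -----------------------
DECODE     Not Applicable                   Not Applicable                 Decoder Render Target            NV12
DECODE     Not Applicable                   Not Applicable                 Decoder Render Target            RGB32, YUY2 (JPEG only)
VPP        Decoder/Processor Render Target  Listed in ColorFourCC          Decoder Render Target            NV12
VPP        Decoder/Processor Render Target  Listed in ColorFourCC          Processor Render Target          RGB32
ENCODE     Decoder Render Target            NV12                           Not Applicable                   Not Applicable
ENCODE     Decoder Render Target            RGB32, YUY2, YV12 (JPEG only)  Not Applicable                   Not Applicable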

Note: “Decoder Render Target” corresponds to the DXVA2_VideoDecoderRenderTarget type, and “Processor Render Target” to DXVA2_VideoProcessorRenderTarget.

 

Table 7: Supported SDK Surface Types and Color Formats for Direct3D11

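SDK Class  Input Surface Type               Input Color Format             Output Surface Type              Output Color Format
---------  -------------------------------  -----------------------------  -------------------------------  -----------------------
DECODE     Not Applicable                   Not Applicable                 Decoder Render Target            NV12
DECODE     Not Applicable                   Not Applicable                 Decoder/Processor Render Target  RGB32, YUY2 (JPEG only)
VPP        Decoder/Processor Render Target  Listed in ColorFourCC          Processor Render Target          NV12
VPP        Decoder/Processor Render Target  Listed in ColorFourCC          Processor Render Target          RGB32
ENCODE     Decoder/Processor Render Target  NV12                           Not Applicable                   Not Applicable
ENCODE     Decoder/Processor Render Target  RGB32, YUY2 (JPEG only)        Not Applicable                   Not Applicable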

Note: “Decoder Render Target” corresponds to D3D11_BIND_DECODER flag, “Processor Render Target” to D3D11_BIND_RENDER_TARGET.

 

The SDK supports a single infrastructure for hardware acceleration on Linux*: “VA API”. The application should use the VADisplay interface as the acceleration device handle for this infrastructure and share it with the SDK through the MFXVideoCORE_SetHandle function. Because the SDK does not create an internal acceleration device on Linux, the application must always share its device with the SDK. This sharing should be done before any actual usage of the SDK, including capability query and component initialization. If the application fails to share the device, the SDK operation will fail.

Example 8 and Example 9 show how to obtain a VA display, from the X Window System and from the Direct Rendering Manager respectively, and share it with the SDK.

    Display   *x11_display;
    VADisplay  va_display;

    /* current_display is the X display name chosen by the application */
    x11_display = XOpenDisplay(current_display);
    va_display  = vaGetDisplay(x11_display);

    MFXVideoCORE_SetHandle(session, MFX_HANDLE_VA_DISPLAY, (mfxHDL) va_display);

Example 8 Obtaining VA display from X Window System

    int       card;
    int       major_version, minor_version;   /* filled in by vaInitialize */
    VADisplay va_display;

    card       = open("/dev/dri/card0", O_RDWR); /* primary card */
    va_display = vaGetDisplayDRM(card);
    vaInitialize(va_display, &major_version, &minor_version);

    MFXVideoCORE_SetHandle(session, MFX_HANDLE_VA_DISPLAY, (mfxHDL) va_display);

Example 9 Obtaining VA display from Direct Rendering Manager

When the SDK decoder creates a hardware acceleration device, it must allocate the list of video memory surfaces for I/O access, also known as the surface chain, and pass the surface chain as part of the device creation command. The application passes the surface chain to the SDK component Init function through an SDK external allocator callback. See the Memory Allocation and External Allocators section for details.

Only the decoder Init function requests an external surface chain from the application and uses it for device creation. The encoder and VPP Init functions may only request internal surfaces. See the ExtMemFrameType enumerator for more details about the different memory types.

The VA API does not define any surface types and the application can use either MFX_MEMTYPE_VIDEO_MEMORY_DECODER_TARGET or MFX_MEMTYPE_VIDEO_MEMORY_PROCESSOR_TARGET to indicate data in video memory.

Table 8 shows the color formats supported by VA API.

Table 8: Supported SDK Surface Types and Color Formats for VA API

SDK Class  Input Color Format             Output Color Format
---------  -----------------------------  -----------------------
DECODE     Not Applicable                 NV12
DECODE     Not Applicable                 RGB32, YUY2 (JPEG only)
VPP        Listed in ColorFourCC          NV12, RGB32
ENCODE     NV12                           Not Applicable
ENCODE     RGB32, YUY2, YV12 (JPEG only)  Not Applicable

 

 

 
