Advanced Bitrate Control Methods in Intel® Media SDK

Introduction

In the world of media, there is great demand for higher encoder quality, but quality comes at the cost of bandwidth consumption. This article addresses that tradeoff by discussing advanced bitrate control methods, which can increase quality (relative to legacy rate controls) at a constant bitrate using Intel® Media SDK / Intel® Media Server Studio tools.

The Intel Media SDK encoder offers many bitrate control methods, which can be divided into legacy and advanced/special-purpose algorithms. This article is the second part of a two-part series on bitrate control methods in Intel® Media SDK. The legacy rate control algorithms are detailed in the first part, Bitrate Control Methods (BRC) in Intel® Media SDK; the advanced rate control methods (summarized in the table below) are explained in this article.

Rate Control | HRD/VBV Compliant | OS supported  | Usage
LA           | No                | Windows/Linux | Storage transcodes
LA_HRD       | Yes               | Windows/Linux | Storage transcodes; streaming solutions (where low latency is not a requirement)
ICQ          | No                | Windows       | Storage transcodes (better quality with smaller file size)
LA_ICQ       | No                | Windows       | Storage transcodes

The following tools are used to explain the concepts and generate performance data for this article: the Intel Media SDK code samples (sample_encode and sample_multi_transcode), Intel® Video Pro Analyzer, and Video Quality Caliper.

Look Ahead (LA) Rate Control

As the name suggests, this bitrate control method looks at the frames to be encoded next and stores them in a look ahead buffer. The length of the look ahead buffer, in frames, is specified by the LookAheadDepth parameter. This rate control is recommended for transcoding/encoding in a storage solution.

Generally, many parameters can be used to modify the quality/performance of the encoded stream. In this particular rate control, the main one is LookAheadDepth, which specifies the number of frames that the SDK encoder analyzes before encoding and can be set between 10 and 100. As LookAheadDepth increases, the encoder looks at more frames, which increases the quality of the encoded stream but decreases performance (encoding frames per second). In our experiments, this performance tradeoff is negligible for smaller input streams.

Look Ahead rate control can be used in sample_encode and sample_multi_transcode (part of the code samples) without code changes. The example below describes how to use this rate control method using the sample_encode application.

sample_encode.exe h264 -i sintel_1080p.yuv -o LA_out.264 -w 1920 -h 1080 -b 10000 -f 30 -lad 100 -la

As the value of LookAheadDepth increases, encoding quality improves, because more frames are stored in the look ahead buffer and the encoder has more visibility into upcoming frames.

Note that LA is not HRD (Hypothetical Reference Decoder) compliant. The following picture, obtained from Intel® Video Pro Analyzer, shows an HRD buffer fullness view with “Buffer” mode enabled, where the sub-mode “HRD” is greyed out. This means no HRD parameters were passed in the stream headers, which indicates that LA rate control is not HRD compliant.

Figure 1: Snapshot of Intel Video Pro Analyzer analyzing an H264 stream (Sintel, 1080p) encoded using the LA rate control method. The left axis of the plot shows frame sizes and the right axis shows the slice QP (Quantization Parameter) values.

Sliding Window condition:

The sliding window algorithm is a part of the Look Ahead rate control method. It applies to both the LA and LA_HRD rate control methods and is enabled by defining WinBRCMaxAvgKbps and WinBRCSize through the mfxExtCodingOption3 structure.

The sliding window condition was introduced to strictly constrain the maximum bitrate of the encoder through two parameters: WinBRCSize and WinBRCMaxAvgKbps. Limiting the achieved bitrate in this way makes it a good fit for limited-bandwidth scenarios such as live streaming.

  • WinBRCSize parameter specifies the sliding window size in frames. A setting of zero means that sliding window condition is disabled.
  • WinBRCMaxAvgKbps specifies the maximum bitrate averaged over a sliding window specified by WinBRCSize.

In this technique, the average bitrate in a sliding window of WinBRCSize frames must not exceed WinBRCMaxAvgKbps. The condition becomes weaker as the sliding window size increases and stronger as it decreases. Whenever the condition fails, the frame is automatically re-encoded with a higher quantization parameter, so encoder performance decreases as failures accumulate. To reduce the number of failures and avoid re-encoding, the encoder analyzes the frames within the look ahead buffer. A peak is predicted when a large frame in the look ahead buffer would cause the condition to fail. Whenever a peak is predicted, the quantization parameter value is increased, thus reducing the frame size.

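The sliding window can be enabled by adding the following code to pipeline_encode.cpp in the sample_encode application.

// Cap the average bitrate over any window at 1.5x the target bitrate.
m_CodingOption3.WinBRCMaxAvgKbps = 1.5*TargetKbps;
m_CodingOption3.WinBRCSize = 90; // window length in frames: 3 * framerate (30 fps)
// Attach the extended buffer so the encoder receives these settings.
m_EncExtParams.push_back((mfxExtBuffer *)&m_CodingOption3);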

The above values were chosen when encoding sintel_1080p.yuv of 1253 frames with H.264 codec, TargetKbps = 10000, framerate = 30fps. Sliding window parameter values (WinBRCMaxAvgKbps and WinBRCSize) are subject to change when using different input options.

If WinBRCMaxAvgKbps is close to TargetKbps and WinBRCSize is close to 1, the sliding window degenerates into a limit on the maximum frame size (TargetKbps/framerate). With the example values above, that is 10000/30, or roughly 333 kilobits per frame.

The sliding window condition can be evaluated by checking that, in any WinBRCSize consecutive frames, the total encoded size does not exceed the budget implied by WinBRCMaxAvgKbps, that is, WinBRCMaxAvgKbps multiplied by the window duration (WinBRCSize / framerate).
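
Written as a formula, the condition can be stated as follows. This is a reconstruction from the definitions above rather than the article's own notation. The symbols are assumptions introduced here: s_k is the encoded size of frame k in bytes, W is WinBRCSize, R_max is WinBRCMaxAvgKbps, f is the frame rate, and 1 Kbps is taken to mean 1000 bits per second.

```latex
\sum_{k=i}^{i+W-1} 8\, s_k \;\le\; 1000 \cdot R_{\max} \cdot \frac{W}{f}
\qquad \text{for every window start } i
```

For the settings used above (R_max = 15000, W = 90, f = 30), the right-hand side is 45,000,000 bits, or 5,625,000 bytes per 90-frame window.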

The frame-size limit can be checked in pipeline_encode.cpp after the asynchronous encode call has completed and the encoded data has been written to the output file.
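
A minimal sketch of such a check is shown below. It is not code from the samples: the function name firstViolatingWindow and its parameters are hypothetical, the per-frame sizes are assumed to have been collected by the application (for example from the encoded bitstream length of each frame), and 1 Kbps is assumed to mean 1000 bits per second.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the index of the first frame that starts a violating window,
// or -1 if every window of `winSize` consecutive frames fits the budget.
// Assumption: 1 Kbps = 1000 bits per second.
long firstViolatingWindow(const std::vector<std::uint32_t>& frameBytes,
                          std::size_t winSize,        // plays the role of WinBRCSize
                          std::uint32_t maxAvgKbps,   // plays the role of WinBRCMaxAvgKbps
                          double frameRate)
{
    assert(winSize > 0 && frameRate > 0.0);
    if (frameBytes.size() < winSize) return -1;  // no complete window to check

    // Bit budget for one window: rate (bits/s) * window duration (s).
    const double budgetBits =
        maxAvgKbps * 1000.0 * (static_cast<double>(winSize) / frameRate);

    std::uint64_t windowBytes = 0;  // running sum over the current window
    for (std::size_t i = 0; i < frameBytes.size(); ++i) {
        windowBytes += frameBytes[i];
        if (i >= winSize) windowBytes -= frameBytes[i - winSize];  // slide the window
        if (i + 1 >= winSize && windowBytes * 8.0 > budgetBits)
            return static_cast<long>(i + 1 - winSize);
    }
    return -1;
}
```

The running sum keeps the check linear in the number of frames, so it can be run on every frame as it is written without recomputing each window from scratch.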

Look Ahead with HRD Compliance (LA_HRD) Rate Control

As Look Ahead bitrate control is not HRD compliant, there is a dedicated mode to achieve HRD compliance with the LookAhead algorithm, known as LA_HRD mode (MFX_RATECONTROL_LA_HRD). With HRD compliance, the Coded Picture Buffer should neither overflow nor underflow. This rate control is recommended in storage transcoding solutions and streaming scenarios, where low latency is not a major requirement.

To use this rate control in sample_encode, code changes are required as illustrated below.

Statements to be added in the sample_encode.cpp file within the ParseInputString() function:

// Select LA_HRD rate control when -hrd is passed on the command line.
else if (0 == msdk_strcmp(strInput[i], MSDK_STRING("-hrd")))
    pParams->nRateControlMethod = MFX_RATECONTROL_LA_HRD;

The LookAheadDepth value can be specified on the command line when executing the sample_encode binary. The example below describes how to use this rate control method using the sample_encode application.

sample_encode.exe h264 -i sintel_1080p.yuv -o LA_out.264 -w 1920 -h 1080 -b 10000 -f 30 -lad 100 -hrd

In the following graph, the LookAheadDepth (lad) value is 100.

Figure 2: A snapshot of Intel® Video Pro Analyzer (VPA), which verifies that LA_HRD rate control is HRD compliant. The buffer fullness view is activated by selecting “Buffer” mode and choosing “HRD” as the sub-mode.

The above figure shows HRD buffer fullness view with “Buffer” mode enabled in Intel VPA, in which the sub-mode “HRD” is selected. The horizontal red lines show the upper and lower limits of the buffer and green line shows the instantaneous buffer fullness. The buffer fullness didn’t cross the upper and lower limits of the buffer. This means neither overflow nor underflow occurred in this rate control.
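
The overflow/underflow behavior shown in the buffer view can be illustrated with a simplified leaky-bucket model. The sketch below is an assumption-laden illustration, not the normative H.264 HRD procedure and not what Intel VPA runs: the names CpbStatus and simulateCpb are hypothetical, bits are assumed to arrive at a constant rate, each frame is assumed to be removed in full once per frame interval, and 1 Kbps is taken as 1000 bits per second.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

enum class CpbStatus { Ok, Underflow, Overflow };

// Simplified leaky-bucket model of a coded picture buffer (CPB).
// Illustration only; not the normative H.264 Annex C procedure.
CpbStatus simulateCpb(const std::vector<std::uint32_t>& frameBytes,
                      double bitrateKbps,          // constant arrival rate
                      double frameRate,
                      double bufferSizeBits,       // upper limit of the buffer
                      double initialFullnessBits)  // fullness before the first frame is decoded
{
    assert(frameRate > 0.0 && bufferSizeBits > 0.0);
    const double bitsPerInterval = bitrateKbps * 1000.0 / frameRate;
    double fullness = initialFullnessBits;

    for (std::uint32_t bytes : frameBytes) {
        const double frameBits = bytes * 8.0;
        // The decoder removes the whole frame at its decode time;
        // if the frame has not fully arrived yet, the buffer underflows.
        if (frameBits > fullness) return CpbStatus::Underflow;
        fullness -= frameBits;
        // Bits keep arriving until the next decode time;
        // if they no longer fit, the buffer overflows.
        fullness += bitsPerInterval;
        if (fullness > bufferSizeBits) return CpbStatus::Overflow;
    }
    return CpbStatus::Ok;
}
```

In this model, a run of very large frames drains the buffer (underflow) and a run of very small frames lets it fill up (overflow), which corresponds to the green line touching the lower or upper red line in the buffer view.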

Extended Look Ahead (LA_EXT) Rate Control

For 1:N transcoding scenarios (1 decode and N encode sessions), there is an optimized look ahead algorithm known as the Extended Look Ahead rate control algorithm (MFX_RATECONTROL_LA_EXT), available only in Intel® Media Server Studio (not part of the Intel® Media SDK). It is recommended for broadcasting solutions.

An application must load the plugin ‘mfxplugin64_h264la_hw.dll’ to support MFX_RATECONTROL_LA_EXT. This plugin can be found in the following location on the local system where Intel® Media Server Studio is installed.

  • “\Program Installed\Software Development Kit\bin\x64\588f1185d47b42968dea377bb5d0dcb4”.

The path of this plugin needs to be specified explicitly because it is not part of the standard installation directory. This capability can be used in either of two ways:

  1. Preferred method - Register the plugin in the registry with all necessary attributes such as API version, plugin type, path, etc., so that the dispatcher, which is a part of the software, can find it through the registry and connect it to a decoding/encoding session.
  2. Place all binaries (Media SDK, plugin, and app) in one directory and execute from that directory.

The LookAheadDepth parameter must be specified only once and is applied to all N transcoded streams. LA_EXT rate control can be used with sample_multi_transcode; an example command line is shown below.

sample_multi_transcode.exe -par file_1.par

The contents of the par file are:

-lad 40 -i::h264 input.264 -join -la_ext -hw_d3d11 -async 1 -n 300 -o::sink
-h 1088 -w 1920 -o::h264 output_1.0.h264 -b 3000 -join -async 1 -hw_d3d11 -i::source -l 1 -u 1 -n 300
-h 1088 -w 1920 -o::h264 output_2.h264 -b 5000 -join -async 1 -hw_d3d11 -i::source -l 1 -u 1 -n 300
-h 1088 -w 1920 -o::h264 output_3.h264 -b 7000 -join -async 1 -hw_d3d11 -i::source -l 1 -u 1 -n 300
-h 1088 -w 1920 -o::h264 output_4.h264 -b 10000 -join -async 1 -hw_d3d11 -i::source -l 1 -u 1 -n 300
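
When the number of output streams varies, a par file of this shape can be generated rather than written by hand. The sketch below is a hypothetical helper (buildLaExtPar is not part of the samples) that only reuses the options shown in the par file above; the output file naming is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Builds par-file text for one decode session feeding N encode sessions.
// Only options that appear in the example par file above are emitted.
std::string buildLaExtPar(const std::string& input, int lookAheadDepth,
                          int width, int height, int numFrames,
                          const std::vector<int>& bitratesKbps)
{
    assert(!bitratesKbps.empty());
    std::ostringstream par;

    // Decode session: LookAheadDepth (-lad) is given once, here.
    par << "-lad " << lookAheadDepth << " -i::h264 " << input
        << " -join -la_ext -hw_d3d11 -async 1 -n " << numFrames
        << " -o::sink\n";

    // One encode session per requested bitrate, all reading from the sink.
    for (std::size_t i = 0; i < bitratesKbps.size(); ++i) {
        par << "-h " << height << " -w " << width
            << " -o::h264 output_" << (i + 1) << ".h264"
            << " -b " << bitratesKbps[i]
            << " -join -async 1 -hw_d3d11 -i::source -l 1 -u 1 -n "
            << numFrames << "\n";
    }
    return par.str();
}
```

The returned string can be written to a file such as file_1.par and passed to sample_multi_transcode with the -par option.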

Intelligent Constant Quality (ICQ) Rate Control

The ICQ bitrate control algorithm is designed to improve the subjective video quality of an encoded stream; depending on the content, it may or may not improve video quality objectively. ICQQuality is the control parameter that defines the quality factor for this method. It can be set between 1 and 51, where 1 corresponds to the best quality. The achieved bitrate and encoder quality (PSNR) can be adjusted by increasing or decreasing ICQQuality. This rate control is recommended for storage solutions, where high quality is required while maintaining a smaller file size.

To use this rate control in sample_encode, code changes are required as explained below.

Statements to be added in sample_encode.cpp within the ParseInputString() function:

// Select ICQ rate control when -icq is passed on the command line.
else if (0 == msdk_strcmp(strInput[i], MSDK_STRING("-icq")))
    pParams->nRateControlMethod = MFX_RATECONTROL_ICQ;

ICQQuality is available in the mfxInfoMFX structure. The desired value can be entered for this variable in the InitMfxEncParams() function, e.g.:

m_mfxEncParams.mfx.ICQQuality = 12;

The example below describes how to use this rate control method using the sample_encode application.

sample_encode.exe h264 -i sintel_1080p.yuv -o ICQ_out.264 -w 1920 -h 1080 -b 10000 -icq

Figure 3: VBR compared with ICQ (ICQQuality varied between 13 and 18) using Intel Media SDK samples and Video Quality Caliper, with H264 encoding of 1080p, 30fps sintel.yuv (1253 frames).

Using about the same bitrate, ICQ shows improved Peak Signal to Noise Ratio (PSNR) in the above plot. The RD-graph data for the above plot was captured using the Video Quality Caliper, which compares two different streams encoded with ICQ and VBR.

Observations from the above performance data:

  • At the same achieved bitrate, ICQ shows much improved quality (PSNR) compared to VBR, while maintaining the same encoding FPS.
  • The encoding bitrate and quality of the stream decrease as the ICQQuality parameter value increases.
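
For reference, PSNR for 8-bit samples is commonly defined as 10 * log10(255^2 / MSE), where MSE is the mean squared error between the original and the decoded picture. The sketch below computes it for a single plane; the function name psnr8bit is hypothetical, and how Video Quality Caliper aggregates PSNR across planes and frames is not covered here.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// PSNR in dB between two 8-bit planes of equal size (for example, luma).
double psnr8bit(const std::vector<std::uint8_t>& ref,
                const std::vector<std::uint8_t>& test)
{
    assert(!ref.empty() && ref.size() == test.size());

    // Accumulate the squared sample differences.
    double sumSq = 0.0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const double d = static_cast<double>(ref[i]) - static_cast<double>(test[i]);
        sumSq += d * d;
    }

    const double mse = sumSq / static_cast<double>(ref.size());
    if (mse == 0.0)  // identical planes: PSNR is unbounded
        return std::numeric_limits<double>::infinity();
    return 10.0 * std::log10(255.0 * 255.0 / mse);
}
```

A higher value means the decoded plane is closer to the original, which is why the curve that sits higher at a given bitrate in an RD graph is the better one.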

The snapshot below shows a subjective comparison between encoded frames using VBR (on the left) and ICQ (on the right). Highlighted sections demonstrate missing details in VBR and improvements in ICQ.

Figure 4: Subjective comparison of encoded frames for VBR vs ICQ, using Video Quality Caliper.

Look Ahead & Intelligent Constant Quality (LA_ICQ) Rate Control

This method combines ICQ with Look Ahead. This rate control is also recommended for storage solutions. ICQQuality and LookAheadDepth are the two control parameters: the quality factor is specified by mfxInfoMFX::ICQQuality and the look ahead depth is controlled by the mfxExtCodingOption2::LookAheadDepth parameter.
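
Since both values have limited ranges, an application may want to check them before starting an encode. The sketch below is a hypothetical helper: LaIcqSettings and checkLaIcqSettings are not SDK types or functions, and the ranges are the ones quoted in this article (1 to 51 for ICQQuality, 10 to 100 for LookAheadDepth), not limits read from the SDK.

```cpp
#include <string>

// Hypothetical container for the two LA_ICQ controls; not an SDK type.
struct LaIcqSettings {
    int icqQuality;      // value intended for mfxInfoMFX::ICQQuality
    int lookAheadDepth;  // value intended for mfxExtCodingOption2::LookAheadDepth
};

// Returns an empty string when both values fall inside the ranges quoted in
// this article, otherwise a short description of the first problem found.
std::string checkLaIcqSettings(const LaIcqSettings& s)
{
    if (s.icqQuality < 1 || s.icqQuality > 51)
        return "ICQQuality must be between 1 (best quality) and 51";
    if (s.lookAheadDepth < 10 || s.lookAheadDepth > 100)
        return "LookAheadDepth must be between 10 and 100";
    return "";
}
```

For example, the combination used later in this section (ICQQuality 12 with a LookAheadDepth of 100) passes the check, while a LookAheadDepth of 5 is reported as out of range.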

To use this rate control in sample_encode, code changes are required as explained below.

Statements to be added in sample_encode.cpp within the ParseInputString() function:

// Select LA_ICQ rate control when -laicq is passed on the command line.
else if (0 == msdk_strcmp(strInput[i], MSDK_STRING("-laicq")))
    pParams->nRateControlMethod = MFX_RATECONTROL_LA_ICQ;

ICQQuality is available in the mfxInfoMFX structure. The desired value can be entered for this variable in the InitMfxEncParams() function, e.g.:

m_mfxEncParams.mfx.ICQQuality = 12;

LookAheadDepth can be specified on the command line with the -lad option:

sample_encode.exe h264 -i sintel_1080p.yuv -o LAICQ_out.264 -w 1920 -h 1080 -b 10000 -laicq -lad 100

Figure 5: VBR compared with LA_ICQ (LookAheadDepth 100, ICQQuality varied between 20 and 26) using Intel Media SDK samples and Video Quality Caliper, with H264 encoding of 1080p, 30fps sintel.yuv (1253 frames).

At a similar bitrate, better PSNR is observed for LA_ICQ compared to VBR, as shown in the above plot. With the LookAheadDepth value kept at 100, the ICQQuality parameter value was changed between 1 and 51. The RD-graph data for this plot was captured using the Video Quality Caliper, which compares two different streams encoded with LA_ICQ and VBR.

Conclusion

Several advanced bitrate control methods are available to experiment with, to see whether higher quality encoded streams can be achieved while keeping bandwidth requirements constant. Each rate control has its own advantages and can be used in specific industry-level use cases depending on the requirement. This article focuses on H264/AVC encoder rate control methods, which might not be applicable to the MPEG2 and H265/HEVC encoders. To implement these bitrate controls, also refer to the Intel® Media SDK Reference Manual, which comes with an installation of the Intel® Media SDK or Intel® Media Server Studio, and the Intel® Media Developer’s Guide from the documentation website. Visit Intel’s media support forum for further questions.
