The Intel Media SDK provides many ways to set up bitrate control to match your needs.
Intel encoders include many types of bitrate control, each designed for different use cases. No single general-purpose choice is optimal across all scenarios. Choose the mode that best matches your requirements (for the AVC codec):
|CQP||No||Yes||Yes||Any usage; can implement own BRC or have macro block level control.|
|CBR||Yes (overflow, underflow)||Yes||Yes||Video conference, Video surveillance solutions|
|VBR||Yes (underflow)||Yes||Yes||Video surveillance storage, Live streaming, Broadcast solutions|
|LA_HRD||Yes||Yes||Yes||Storage transcodes; Streaming (where low latency is not a requirement)|
|VCM||Yes||Yes||No||Video conferencing solution|
Unlike uncompressed video, frame sizes of encoded bitstreams are expected to vary widely. IDR and I frames may be an order of magnitude or more larger than the accompanying P and B frames. This is one of the core ideas of video compression: P and B frames store differences and don't encode entire pictures. B frames tend to be smaller than P frames. Content differences and scene changes can cause further variations. There are also hierarchical B frames, which provide good compression efficiency and limit the propagation of errors; these frames are closer to P frames in size.
The Hypothetical Reference Decoder (HRD), previously known as the Video Buffering Verifier (VBV) in the MPEG-2 era, is a "leaky bucket" model. If the bitstream conforms to it, the likelihood increases that any conforming decoder will be able to receive the stream as intended without delays or dropped frames. This is especially necessary if the target audience for the encoded stream includes low-cost hardware decoders that lack the additional buffering and error recovery available on most general-purpose computers.
HRD is a simplified model that assumes data arrives according to individual frame sizes but exits at the given bitrate. The goal is to keep the buffer, or "bucket", from underflowing or overflowing.
For hardware decoders with small buffers, overflows and underflows can mean dropped frames and a poor end-user experience. However, maintaining HRD conformance can be expensive: in some cases frames must be re-encoded, and it increases BRC algorithm complexity and reduces performance. So it is best to understand when it is needed and when it is not.
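The leaky bucket described above can be sketched in a few lines. This is only an illustration of the simplified model in the text (frames fill the bucket, the bucket drains at the target bitrate), not the normative HRD from the codec standard and not Media SDK code. The function name, the clamping behavior after a violation, and the example numbers are all assumptions made for this sketch.

```python
def simulate_leaky_bucket(frame_sizes_bits, bitrate_bps, fps,
                          buffer_size_bits, initial_fullness_bits=0):
    """Track bucket fullness frame by frame and list any violations.

    Simplified model: each encoded frame adds its size to the bucket,
    then the bucket drains by one frame interval's worth of bits.
    """
    drain_per_frame = bitrate_bps / fps
    fullness = initial_fullness_bits
    events = []
    for index, size in enumerate(frame_sizes_bits):
        fullness += size
        if fullness > buffer_size_bits:
            # Frame did not fit: a real encoder might re-encode it smaller.
            events.append((index, "overflow"))
            fullness = buffer_size_bits  # clamp so the simulation can continue
        fullness -= drain_per_frame
        if fullness < 0:
            # Bucket ran dry: a CBR encoder would pad the frame here.
            events.append((index, "underflow"))
            fullness = 0
    return events


# One large I frame followed by small P frames, 1.2 Mbps at 30 fps.
print(simulate_leaky_bucket([200_000, 10_000, 10_000], 1_200_000, 30, 100_000))
```

Running the same frame sizes against a larger `buffer_size_bits` shows why bigger buffers tolerate larger I frames without a violation.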
Constant QP (CQP) provides the most control and best performance. Without question, the best coding efficiency with Intel codecs can be obtained via CQP plus custom content analysis. CQP often has significant performance advantages as well. CQP operates most closely to reference implementations. It is the most direct way to access codec capabilities and measure the effects of encoder parameter/algorithm trade-offs and also is the clearest way to evaluate against other codec algorithm implementations.
Also without question: making this work well cannot be done without extra development. Simply setting one global QP for all frames without content analysis/dynamic adjustments is quite likely to have worse bitrate vs quality than other BRC alternatives. The main drawback for CQP is that it leaves the selection of quantizers entirely up to the application. As there is no simple or linear mapping between QP and either perceived quality or resulting bitrate, one must build a feedback system that analyzes content and adapts to the result of each frame’s size.
However, CQP mode unlocks the most sophisticated types of content analysis, which provides a broad area for innovation leveraging computer vision and perceptually-sensitive content analysis, both across frames and using per-MB QP adjustments.
The lack of analysis means CQP is generally the fastest (lowest complexity) BRC method, although the additional work required at the application level will generally outweigh this. As CQP is best left to expert users, the remainder of this article will focus on “automated” BRC algorithms.
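To give a sense of the feedback system a CQP application has to build, here is a minimal sketch of a per-frame QP controller. Everything in it is an assumption for illustration: `toy_frame_bits` is a stand-in for a real encoder that uses the rough rule of thumb that frame size halves for every +6 QP, and the step limit of 2 is an arbitrary choice. A production controller would also need the content analysis discussed above.

```python
import math

QP_MIN, QP_MAX = 1, 51  # AVC quantizer range used in this sketch


def toy_frame_bits(complexity, qp):
    """Toy encoder model (assumption): size halves for every +6 QP."""
    return complexity * 2.0 ** (-qp / 6.0)


def next_qp(qp, actual_bits, target_bits, max_step=2):
    """Nudge QP toward the value that would have hit the target size."""
    ideal_step = round(6.0 * math.log2(actual_bits / target_bits))
    step = max(-max_step, min(max_step, ideal_step))  # avoid abrupt quality jumps
    return max(QP_MIN, min(QP_MAX, qp + step))


def run_feedback(complexities, target_bits, start_qp=30):
    """Encode each frame with the current QP, then adapt from its size."""
    qp = start_qp
    history = []
    for complexity in complexities:
        bits = toy_frame_bits(complexity, qp)
        history.append((qp, bits))
        qp = next_qp(qp, bits, target_bits)
    return history


for qp, bits in run_feedback([1e6] * 5, target_bits=15625):
    print(qp, round(bits))
```

With constant complexity the QP climbs in steps of 2 until the toy frame size matches the target, then holds steady.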
The standard/legacy algorithms from the beginning of audio and video compression are CBR and VBR:
Constant Bitrate (CBR)
In this method, the encoder performs padding (adding zeros to the end of encoded frames) when an encoded frame is smaller than what is required to meet the HRD requirements. CBR "optimizes" for a constant flow of data, which means a significant portion of the overall bitrate is wasted on padding instead of being used to encode frame details, so much of the capacity goes unused. This method is available for historical/legacy reasons. It is the default BRC method used in many Media SDK samples.
BufferSizeInKB is an important parameter for CBR. A smaller BufferSizeInKB means smaller frame size variations. At very small buffer sizes it becomes increasingly difficult to maintain HRD conformance, but this is the best method to reduce CBR spikes. Larger buffers allow more variation to preserve quality where needed. As an example, the Media SDK tutorials were used to vary BufferSizeInKB and observe the effect on instantaneous bitrate.
Here bitrate is shown in Kbps to illustrate the variation over the frames being encoded. The smallest buffer size that can encode at 4000 kbps (the target bitrate) is 34 KB; below this the encoder returns an error. The instantaneous bitrate variation changes with buffer size from 34 KB to 1000 KB, after which it starts to saturate. The least variation is seen with a buffer size of 200 KB. However, if large variations can be tolerated, other BRC modes may be more appropriate. These values will differ for different inputs and for different target bitrates.
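One way to measure the instantaneous bitrate discussed here is to take a sliding window over the encoded frame sizes. The sketch below assumes the per-frame sizes in bytes have already been collected from the encoder output; the function name, the window length, and the sample sizes are illustrative and are not taken from the tutorials.

```python
def instantaneous_kbps(frame_sizes_bytes, fps, window_frames):
    """Bitrate over each sliding window of frames, in kbps (1000 bits/s)."""
    rates = []
    for end in range(window_frames, len(frame_sizes_bytes) + 1):
        window = frame_sizes_bytes[end - window_frames:end]
        bits = 8 * sum(window)
        # bits / (window_frames / fps) seconds, converted to kbps
        rates.append(bits * fps / window_frames / 1000.0)
    return rates


sizes = [60_000, 12_000, 9_000, 14_000, 11_000, 58_000, 10_000, 13_000]
rates = instantaneous_kbps(sizes, fps=30, window_frames=4)
print([round(r) for r in rates])
print("spread (kbps):", round(max(rates) - min(rates)))
```

Comparing the spread between runs with different BufferSizeInKB settings gives a single number for how much a given buffer size smooths the stream.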
Variable Bitrate (VBR)
This method allows more bitrate variation to match the complexity of the input, i.e. it assigns a higher bitrate to complex content such as scene changes and a lower bitrate to less complex content. It tries to achieve a smaller overall file size, but it also produces unpredictable spikes. It provides better overall quality for a given bitrate by allowing the bitrate to fluctuate more and by disabling padding. This is the default method used in the Media SDK tutorials. However, for the subset of cases where low latency is required and a target bitrate must be maintained, VBR should be considered.
A test comparing VBR and CBR was run using the Media SDK tutorials.
Lookahead is an extension of VBR, representing a significant advance in automatic BRC for Intel codecs. It can provide large quality improvements, sometimes larger than a more expensive target usage, at roughly the same FPS/channel capacity as CBR/VBR. It should be considered the default choice for many file-to-file transcode scenarios.
Lookahead provides an answer to key VBR/CBR challenges: causes of large frame size variation like I-frames and scene changes can be anticipated. A buffer of frames (length configurable with lookahead depth parameter) is analyzed for potential bitrate disruptions. This allows a "preview" or "lookahead" with many of the advantages of two-pass encoding. In effect, it is VBR with advance notice, higher variability, and longer latency.
It works by performing extensive analysis of several dozen frames, including complexity, relative motion, and dependencies, before the actual encoding. It distributes the available bit budget between frames to produce the best possible encoding quality. It generates good results on fast-motion video and computer animation. It also improves both objective metrics such as SSIM and PSNR and subjective video quality. It works with any GOP pattern, but the presence of B frames provides the best quality gain. One side effect is that it significantly increases encoding delay and memory consumption. Due to this increased latency it may not be the best choice for game streaming.
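The idea of distributing a bit budget across a window of upcoming frames can be illustrated with a toy allocator. This is not Intel's lookahead algorithm, whose internals are not described here; it is a sketch that assumes each frame already has a scalar complexity estimate and simply shares the window's budget in proportion to it. The function name and inputs are hypothetical.

```python
def lookahead_targets(complexities, avg_bits_per_frame, depth):
    """Per-frame bit targets based on the next `depth` frames.

    Each frame gets a share of its window's budget proportional to its
    complexity relative to the frames that follow it.
    """
    targets = []
    for i, complexity in enumerate(complexities):
        window = complexities[i:i + depth]
        window_total = sum(window)
        if window_total == 0:
            targets.append(float(avg_bits_per_frame))
        else:
            budget = avg_bits_per_frame * len(window)
            targets.append(budget * complexity / window_total)
    return targets


# A complex frame (e.g. a scene change) sits at index 2.
print(lookahead_targets([1, 1, 4, 1], avg_bits_per_frame=1000, depth=2))
```

In this example the frame just before the complex one receives fewer bits than average, because the allocator can see the expensive frame coming. That is the "advance notice" that plain VBR lacks.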
LookAheadDepth specifies the depth of the lookahead rate control algorithm, i.e. the number of frames analyzed before encoding. The valid range is 10 to 100. To instruct the SDK encoder to use the default value, the application should set this field to zero. The only available rate control parameter in this mode is mfxInfoMFX::TargetKbps; two other parameters, MaxKbps and InitialDelayInKB, are ignored. Media SDK must be initialized with API version 1.7 or newer to use this method.
Lookahead is only available on Intel® Iris™ Pro Graphics, Intel® Iris™ Graphics and Intel® HD Graphics on Haswell architecture (4th Generation Core) and later. Both interlaced and progressive content are supported.
ICQ/LA_ICQ: These algorithms focus on maintaining a target quality level (roughly equivalent to CQP 1-51 quantization). They allow even larger variation than CBR/VBR/lookahead.
QVBR: This algorithm also sets a target quality level, via the QVBRQuality parameter in mfxCodingOption3, along with a target bitrate for scenarios where variability may need to be limited, such as game/display streaming. It tries to achieve the target subjective quality with the minimum number of bits while trying to keep the bitrate constant and maintaining HRD compliance. QVBR is supported from 4th generation Intel® Core processors (codename Haswell) onward.
VCM: This is a specialized videoconferencing mode. In some videoconferencing scenarios it can provide higher quality at lower bitrate.
Note: For the AVC codec, Look Ahead (LA), LA_ICQ, LA_HRD, VCM, and QVBR are not available in the software implementation. For the latest updates and details, please check the release notes.
BRCParamMultiplier specifies a multiplier for bitrate control parameters. It is useful for achieving higher bitrates: the multiplier is applied to TargetKbps (which is an mfxU16) to reach quite high bitrates (validated up to 200 Mbps). Please keep in mind that this multiplier also affects BufferSizeInKB, InitialDelayInKB and MaxKbps. The table below shows the scenario when this parameter is used.
I used the simple_encode tutorial (taken from the tutorials on the Media Solution Portal) to encode an uncompressed 720p Park Joy video at higher bitrates, with and without BRCParamMultiplier. Here are the results:
This table shows that beyond 60 Mbps, the conventional method does not achieve the higher bitrates. In these scenarios, you are advised to use BRCParamMultiplier to go beyond 60 Mbps, and also to modify MaxLength and BufferSizeInKB accordingly. BRCParamMultiplier was introduced in Media SDK API version 1.3.
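The arithmetic behind the multiplier can be sketched as follows. Because TargetKbps is a 16-bit unsigned field, it cannot hold a value above 65535, so a larger target has to be split into a multiplier and a field value. The helper below is a hypothetical illustration of that split, not an SDK function; how an application picks the multiplier is up to the application, and this sketch simply chooses the smallest one that fits.

```python
U16_MAX = 65535  # largest value a 16-bit unsigned field can hold


def split_bitrate(target_kbps):
    """Return (multiplier, field_value) with the field fitting in 16 bits.

    multiplier * field_value is the effective bitrate; it is equal to
    target_kbps when the target divides evenly, otherwise slightly above.
    """
    multiplier = max(1, -(-target_kbps // U16_MAX))  # ceiling division
    field_value = -(-target_kbps // multiplier)      # ceiling division
    return multiplier, field_value


for kbps in (4_000, 60_000, 100_000, 200_000):
    multiplier, field = split_bitrate(kbps)
    print(kbps, "->", multiplier, "x", field, "=", multiplier * field)
```

Since the same multiplier also applies to BufferSizeInKB, InitialDelayInKB and MaxKbps, those fields would need to be divided by the chosen multiplier in the same way before being set.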
For more information about the bitrate control methods, read the Media SDK Manual, which comes with an Installation and Developers Guide from the documentation site, and ask any questions on our media forum.