The VEBox is an independent fixed-function silicon block within Intel GPU hardware that provides a variety of image enhancement stages. This block is completely independent of the GPGPU pipeline and can therefore execute concurrently with regular OpenCL kernels. Executing work on the VEBox does not impact execution unit (EU) performance. See here for more information on preview extensions, how features are enabled and how to provide feedback.
The VEBox is not a programmable block; it is a fixed-function unit that implements an image processing pipeline for algorithms that are common in video processing workloads. Each stage in the pipeline is implemented in gates to maximize workload performance with a low power footprint. The downside is that each stage of the pipeline is rigidly defined to perform a set algorithm, which may be configured in a number of ways but cannot be fundamentally altered. If a workload can make use of the operations provided by the VEBox, there may be considerable performance advantages. Key among these is the ability to do computations without affecting either graphics or GPGPU performance. The VEBox shares memory resources with the rest of the HD Graphics part, and the cost of moving an OpenCL memory object between EU computation and VEBox computation is minimal.
The OpenCL extensions for VEBox define a hardware-only interface. This means that video enhancement features are only present when supported by physical hardware. The OpenCL API may be used to determine whether a VEBox is present on the local machine and which features it supports.
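One common way to probe for an OpenCL extension is to read the device's extension string with `clGetDeviceInfo` and `CL_DEVICE_EXTENSIONS`, then look for the extension name in that space-separated list. The sketch below covers only the string-matching half, so it compiles without an OpenCL SDK. The helper name `has_extension` is hypothetical, and the VEBox specs may define additional capability queries that are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Returns true if `name` appears as a whole, space-delimited token in
 * `ext_list` (the format used by CL_DEVICE_EXTENSIONS). A plain strstr()
 * would also accept longer names that merely contain `name`. */
bool has_extension(const char *ext_list, const char *name)
{
    assert(ext_list != NULL && name != NULL);

    size_t len = strlen(name);
    if (len == 0)
        return false;

    const char *p = ext_list;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space... */
        bool starts_token = (p == ext_list) || (p[-1] == ' ');
        /* ...and must end at a space or at the end of the string. */
        bool ends_token = (p[len] == ' ') || (p[len] == '\0');
        if (starts_token && ends_token)
            return true;
        p += len; /* keep searching after this partial match */
    }
    return false;
}
```

In a host program, `ext_list` would be the buffer filled in by `clGetDeviceInfo(device, CL_DEVICE_EXTENSIONS, ...)`, and `name` would be an extension name such as `cl_intelx_video_enhancement`.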
The presence and capabilities of VEBox hardware depend on particular combinations of hardware and driver versions. In general, at the publication time of this article these features are available for processors with Gen9 GPUs (6th/7th Generation Core processors) on Linux.
The short answer: a lot. The VEBox has built-in features for processing video (e.g. deinterlace), working with raw camera data (e.g. demosaic) and a suite of common image processing operations (e.g. color space conversion, color correction, contrast enhancement, etc.). The VEBox pipeline on Skylake is illustrated below:
Whenever work is enqueued to the VEBox, the entire pipeline is invoked. Invocation always happens with a set of input and output data (images) and an opaque pipeline state configuration. A typical usage is to invoke the pipeline with an input image, an output image, and an accelerator object (more about this later).
The pipeline is broken down into roughly three sub-pipelines: the Camera pipeline, the Denoise and Deinterlace (DN/DI) pipeline, and the Image Enhancement and Color Processing (IECP) pipeline. Execution of the VEBox starts from the top of one of these three sub-pipelines. For example, a programmer working with RAW sensor data can invoke the pipeline starting with the Camera pipe. Similarly, a programmer who wants to convert interlaced video frames to progressive frames can invoke the pipeline from the DN/DI stage.
Generally speaking, any enabled stage downstream from where execution begins is applied in a single invocation. For example, the programmer may invoke the VEBox with RAW camera data, demosaic it (i.e. convert it to 4:4:4 color), and then convert it to RGB in the Color Space Conversion stage (downstream in the IECP pipeline). The performance impact of enabling many stages versus just a few is typically negligible. Here is a brief description of each stage:
|Black Level Correction||Adjusts the black level|
| ||Reduces lens color distortion|
|White Balance Correction||Applies white balance correction|
|Hot Pixel Correction||Reduces salt-and-pepper noise and other artifacts|
|Denoise||Adaptive noise reduction for improved quality|
|Deinterlace||Converts from interlaced to progressive|
|Demosaic||Converts Raw Bayer patterns to YUV color|
|Color Correction Matrix||Applies color correction|
|Forward Gamma Correction||Applies gamma correction|
|Front-End Color Space Conversion||Converts the color space to YUV for later processing|
|Skin-Tone Detection and Enhancement||Improves the visual quality of skin-toned pixels|
|Gamut Compression||Reduces color gamut in a way that minimizes distortion|
|Adaptive Contrast Enhancement||Adaptively boosts contrast, improving overall image quality|
|Total Color Correction||Modifies the colors based on key RGBYMC values|
|Process Amplifier||Controls hue, saturation, brightness and contrast|
|Back-end Color Space Conversion||Converts pixels in the pipeline to a desired format|
|Gamut Expansion / Color Correction||Expands color to a wider gamut and other corrections|
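The rule that every enabled stage downstream of the entry point runs in a single invocation can be modeled in a few lines of C. This is only an illustrative sketch, not the extension API: the type and function names are hypothetical, and both the ordering Camera, DN/DI, IECP and the assignment of a stage to a sub-pipeline are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Sub-pipelines, in the order assumed for this sketch. */
typedef enum { PIPE_CAMERA = 0, PIPE_DNDI = 1, PIPE_IECP = 2 } SubPipe;

typedef struct {
    const char *name; /* stage name, e.g. from the table above */
    SubPipe pipe;     /* sub-pipeline the stage is assumed to belong to */
    int enabled;      /* nonzero if the stage is enabled in the pipeline state */
} Stage;

/* Writes to `out` the names of the stages that would run when execution
 * starts at `entry`: every enabled stage in that sub-pipeline or a later one.
 * Returns the number of names written (at most `cap`). */
size_t stages_that_run(const Stage *stages, size_t n, SubPipe entry,
                       const char **out, size_t cap)
{
    assert(stages != NULL && out != NULL);

    size_t count = 0;
    for (size_t i = 0; i < n && count < cap; ++i) {
        if (stages[i].enabled && stages[i].pipe >= entry)
            out[count++] = stages[i].name;
    }
    return count;
}
```

With Demosaic, Deinterlace and Back-end Color Space Conversion enabled, starting at `PIPE_CAMERA` selects all three, while starting at `PIPE_DNDI` skips Demosaic and selects only the two downstream stages.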
Most stages of the VEBox pipeline are expressed with minimal abstractions, allowing the user to make use of the hardware without the interface making any assumptions about the user's workload.
In a few cases, certain stages (e.g. color space conversion) may be configured by the driver to reduce the implementation complexity of common use cases. However, the programmer can override any of these default configurations by providing an explicit configuration.
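The default-with-override pattern can be illustrated with a software color space conversion. The sketch below is not the VEBox interface and does not reflect the driver's actual defaults: `CscConfig`, `DEFAULT_CSC` and `csc_apply` are hypothetical names, and the default matrix is the widely used full-range BT.601 RGB to YUV conversion for 8-bit data, chosen here only as a plausible stand-in.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    double m[3][3];   /* 3x3 conversion matrix, row-major */
    double offset[3]; /* added to each output channel after the multiply */
} CscConfig;

/* Hypothetical default: full-range BT.601 RGB -> YUV for 8-bit data. */
static const CscConfig DEFAULT_CSC = {
    { {  0.299,     0.587,     0.114    },
      { -0.168736, -0.331264,  0.5      },
      {  0.5,      -0.418688, -0.081312 } },
    { 0.0, 128.0, 128.0 }
};

/* Converts one 8-bit pixel. Passing NULL for `override_cfg` selects the
 * default configuration; a non-NULL pointer replaces it entirely. */
void csc_apply(const CscConfig *override_cfg,
               const unsigned char in[3], unsigned char out[3])
{
    assert(in != NULL && out != NULL);

    const CscConfig *cfg = override_cfg ? override_cfg : &DEFAULT_CSC;
    for (int r = 0; r < 3; ++r) {
        double v = cfg->offset[r];
        for (int c = 0; c < 3; ++c)
            v += cfg->m[r][c] * in[c];
        /* Clamp to the 8-bit range, then round to the nearest integer. */
        if (v < 0.0)
            v = 0.0;
        if (v > 255.0)
            v = 255.0;
        out[r] = (unsigned char)(v + 0.5);
    }
}
```

Calling `csc_apply(NULL, rgb, yuv)` uses the default, while passing a caller-built `CscConfig` (for example an identity matrix with zero offsets) takes full control of the stage. This mirrors the idea that the explicit configuration always wins over the driver-supplied one.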
The OpenCL VEBox extensions expose low-level interfaces to the VEBox. There are three extension interfaces that roughly correspond to the three sub-pipelines described above. While these three interfaces culminate in a single pipeline, they are kept distinct for forward and backward compatibility reasons. These extensions are written in a form consistent with other OpenCL extensions.
The starting point for obtaining VEBox built-in kernels, command queues and accelerator objects.
More info: Spec
Samples: Minimal VEBox samples
Extends cl_intelx_video_enhancement with the IECP pipeline and describes the interface for accessing statistical information used in adaptive filters.
More info: Spec
Extends cl_intelx_video_enhancement with the camera pipe, enabling workloads that operate on raw sensor data.
More info: Spec