Note The conservative morphological anti-aliasing 2.0 (CMAA2) algorithm is a significant update of the original algorithm presented here.
This article was taken from a blog posting on Intel® Developer Zone (Intel® DZ) by Leigh Davies at Intel Corporation, highlighting work and results completed by Leigh and his colleague Filip Strugar in the new AA technique being referred to as Conservative Morphological Anti-Aliasing. Below is the content of the blog along with the available code project download for your examination.
This sample presents a new, image-based, post-processing antialiasing technique referred to as Conservative Morphological Anti-Aliasing and can be downloaded here. The technique was originally developed by Filip Strugar at Intel for use in GRID2 by Codemasters*, to offer a high performance alternative to traditional multi sample anti-aliasing (MSAA) while addressing artistic concerns with existing post-processing antialiasing techniques. The sample allows CMAA to be compared with several popular post processing techniques together with hardware MSAA in a real time rendered scene as well as to an existing image. The scene is rendered using a simple HDR technique and includes basic animation to allow the user to compare how the different techniques cope with temporal artifacts in addition to static portions of the image.
Figure 1. CMAA Sample using HDR and animating geometry
MSAA has long been used to reduce aliasing in computer games and significantly improve their visual appearance. Basic MSAA works by running the pixel shader once per pixel but running the coverage and occlusion tests at higher than normal resolution, typically 2x through 8x, and then merging the results together. While significantly faster than super sampling it still represents a significant additional cost compared to no anti-aliasing and is difficult to implement with certain techniques – for example, this sample uses a custom fullscreen pass needed to get correct post HDR tone-mapping MSAA resolve6.
An alternative to MSAA is to use an image-based post-process anti-aliasing (PPAA), which became practical with GPU ports of Morphological antialiasing (MLAA)1 and further developments such as “Enhanced Subpixel Morphological Antialiasing”(SMAA)2 and NVidia’s “Fast approximate anti-aliasing” (FXAA)3. Compared to MSAA, these PPAA techniques are easy to implement and work in scenarios where MSAA does not (such as deferred lighting and other non-geometry based aliasing), but lack adequate sub-pixel accuracy and are less temporally stable. They also cause perceptible blurring of textures and text, since it is difficult for edge-detection algorithms to distinguish between intentional colour discontinuities and unwanted aliasing caused by imperfect rendering.
Currently two of the most popular PPAA algorithms are:
Is an algorithm based on MLAA but with a number of innovations and improvements, and with a number of quality/performance presets. It implements advanced pattern recognition and local contrast adaptation, and the more expensive variations use temporal super-sampling to reduce temporal instability and improve quality. The SMAA algorithm version referenced in this document is the latest public code v2.7.
Is a much faster effect. However, FXAA has simpler colour discontinuity shape detection, causing substantial (frequently unwanted) image blurring. It also has fairly limited kernel size by default, so it doesn't sufficiently anti-alias longer edge shapes, while increasing the kernel size impacts performance significantly. FXAA algorithm version referenced in this document is v3.8 unless otherwise specified (newest v3.11 was added to the sample in the last minute, in addition to 3.8).
In this sample we introduce a new technique called Conservative Morphological Anti-Aliasing (CMAA). CMAA addresses two requirements that are currently not addressed by existing techniques:
CMAA is positioned between FXAA and SMAA 1x in computation cost (0.9-1.2x the cost of default FXAA 3.11 and 0.45-0.75x the cost of SMAA 1x) on 4th generation Intel® HD Graphics hardware and above. Compared to FXAA, CMAA provides significantly better image quality and temporal stability as it correctly handles edge lines up to 64 pixels long and is based on an algorithm that only handles symmetrical discontinuities in order to avoid unwanted blurring (thus being more conservative). When compared to SMAA 1x it will provide less anti-aliasing as it handles fewer shape types but also causes less blurring, shape distortion, and has more temporal stability (is less affected by small frame-to-frame image changes).
CMAA has four basic logical steps (not necessarily matching the order in the implementation):
Edge detection is performed by comparing neighboring colours using:
An edge (discontinuity) exists if the difference of neighboring pixel values is above a preset threshold (which is determined empirically).
Figure 2. Showing results of a default edge detection algorithm.
This step serves a similar function to “local contrast adaptation” in SMAA and “local contrast test” in FXAA but with a smaller kernel. For each edge detected in Step 1, colour delta value above threshold (dEc) is compared to that of neighboring 12 edges (dEn):
Figure 3. Neighboring edges considered for local contrast adaptation.
This smaller local adaptation kernel size is somewhat less efficient at increasing effective edge detection range. However, it is more effective at preventing blurring of small shapes (such as text), reducing local shape interference from less noticeable edges, avoiding some of the pitfalls of large kernels (visible kernel-sized transition from un-blurred to blurry), and has better performance.
Edges detected in step 1, and refined in step 2, are used to make assumptions about the shape of the underlying edge before rasterization (virtual shape). For simple shape handling, all pixels are analyzed for existence of 2, 3 and 4 edge aliasing shapes, and colour transfer is applied to match the virtual shape colour coverage and achieve the local anti-aliasing effect (Figure 4). While this colour transfer is not always symmetrical, the amount of shape distortion is minimized to sub-pixel size.
Figure 4. 2-edge, 3-edge and 4-edge shapes; reconstructed virtual shape shown in yellow; black arrows showing anti-aliasing colour blending direction
4a. Shape rasterization pixel step (which is mostly a triangle edge). Criterion used for this detection is illustrated in Figure 5. Four Z-shape orientations (with 90° difference) are handled.
Figure 5. Z-shape detection criterion is true if edges illustrated blue are present while red ones are not; green arrows show subsequent edge tracing.
4b. For each detected Z-shape, the length of the edge to the left/right is determined by tracing the horizontal (for two horizontal Z-shapes) edges on both sides, and stopping if none are present on either side, or a vertical edge is encountered (Figure 6).
4c. The edge length from the previous step is used to reconstruct the location of the virtual shape edge and apply colour transfer (to both sides of the Z-shape) to match coverage that it would have at each pixel. This step overrides any anti-aliasing done in Step 3 on the same pixels.
Figure 6. Long edge (Z-shapes): edge length tracing marked blue, with Z shape at center; reconstructed virtual shape shown in yellow; black arrows showing anti-aliasing colour blending direction.
The inherent symmetry of this approach better preserves overall image average colour and shape, ignores borderline cases, and better preserves original shapes while also being more temporally stable. One pixel (or few pixels) changes do not induce drastic colour transfer and shape modification (when compared to SMAA 1X, FXAA 3.8/3.11 and older MLAA-based techniques).
Figure 7. Typical detection and handling of symmetrical Z shapes (circled in yellow)
Figure 8. All CMAA shapes: original image, edge detection and final anti-aliased image (with/without edges)
The sample UI allows a direct comparison between several anti-aliasing techniques that are selectable from within a drop down menu along with several debug features. All the techniques can be viewed in high detail using a zoom box that can be enabled from the UI and positioned by using a right mouse click. For both CMAA and SAA additional debug information is shown that highlights the actual edges that have been detected by the algorithm; slider allows you to adjust the threshold used for edge detection; In the case of CMAA both the edge threshold and the non-dominant edge removal threshold can be modified.
The effect on performance caused by modifying the threshold can be viewed if the application is run with vsync disabled. GPU performance metrics are displayed in the upper left hand corner of the application showing the overall cost of rendering the scene and the time taken in the post processing anti-aliasing code. When viewing the stats, additional debug information such as the zoombox and the edge view should be disabled as they both lower performance by forcing sub-optimal code paths to be used. When viewing CMAA in the zoombox with “Show Edges” enabled the zoombox will also animate showing the effects of applying CMAA to the image, this doesn’t affect the rest of the display.
For precise profiling of each technique, “Run benchmark for: …” button can be used to activate automatic multiple frame sampling and comparison, with results (cost delta compared to the base non-AA version) displayed in a message box after the run is finished.
Figure 9. Debug information including zoombox and edge detection overlay.
In addition to showing the effect of the various post-processing effects on the real-time scene the application allows a static image to be loaded and used as the source for the effect; the currently supported file format is PNG. A synthetic sample image is provided in the samples media directory (Figure 9). Attempting to run the sample with 2x and 4x MSAA will have no effect as these would normally affect the image source but CMAA, SMAA, FXAA and SAA can all be applied to the image. This feature quickly allows anyone considering using any of the post processing techniques in the sample to load images taken from their own application and experiment with the various threshold parameters.
The following figures show a number of quality and performance comparisons:
Figure 10. Performance impact (frames per second) of an older implementation of CMAA and MSAA on a Consumer Ultra-Low Voltage (CULV) i7-4610Y CPU with HD Graphics 4200 GPU, in GRID2 by Codemasters*.
Figure 11. Cost and scaling of CMAA 1.3 and other post-process anti-aliasing effects measured using the sample from the article, applied 10 screenshots from various games, averaged, on Intel 4th generation CPUs (HD 5000 and GD5200 graphics) and AMD R9-290.
Figure 12. Quality comparison for text and image anti-aliasing. CMAA 1.3 manages high quality anti-aliasing of the image while preserving text and without over blurring the geometry.
Figure 13. Quality comparison for synthetic shapes and game scenes with high frequency textures. CMAA preserves original high frequency texture data better than FXAA 3.11 and SMAA 1x, while still applying adequate anti-aliasing (although below the quality).
Figure 15. Impact of various techniques on GUI elements. Any post-process AA should always be applied before GUI to avoid unwanted blurring, but there are cases when this is unavoidable (such as in a driver implementation or if the GUI is part of the 3D.
This article was taken from a blog posting on Intel® DZ by Leigh Davies at Intel Corporation, highlighting work and results completed by Leigh and his colleague Filip Strugar in the new AA technique being referred to as Conservative Morphological Anti-Aliasing.
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804