Note: The conservative morphological anti-aliasing 2.0 (CMAA2) algorithm is a significant update of the original algorithm presented here.
This article was taken from a blog posting on Intel® Developer Zone (Intel® DZ) by Leigh Davies at Intel Corporation, highlighting work and results completed by Leigh and his colleague Filip Strugar on the new AA technique referred to as Conservative Morphological Anti-Aliasing. Below is the content of the blog along with the available code project download for your examination.
This sample presents a new, image-based, post-processing anti-aliasing technique referred to as Conservative Morphological Anti-Aliasing (CMAA) and can be downloaded here. The technique was originally developed by Filip Strugar at Intel for use in GRID2 by Codemasters*, to offer a high-performance alternative to traditional multisample anti-aliasing (MSAA) while addressing artistic concerns with existing post-processing anti-aliasing techniques. The sample allows CMAA to be compared with several popular post-processing techniques, together with hardware MSAA, both in a real-time rendered scene and on an existing image. The scene is rendered using a simple HDR technique and includes basic animation, so the user can compare how the different techniques cope with temporal artifacts as well as with static portions of the image.
Figure 1. CMAA Sample using HDR and animating geometry
MSAA has long been used to reduce aliasing in computer games and significantly improve their visual appearance. Basic MSAA works by running the pixel shader once per pixel but running the coverage and occlusion tests at higher than normal resolution, typically 2x through 8x, and then merging the results together. While significantly faster than supersampling, it still represents a significant additional cost compared to no anti-aliasing and is difficult to implement with certain techniques; for example, this sample uses a custom fullscreen pass to get a correct post-tone-mapping MSAA resolve for HDR [6].
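To illustrate why the order of tone mapping and resolve matters for HDR, here is a minimal sketch. It assumes a Reinhard-style curve as the tone-mapping operator and uses hypothetical function names; the sample's actual operator and resolve pass are not described in this article.

```python
def tonemap(x):
    """Reinhard operator, used here only as an example tone-mapping curve."""
    return x / (1.0 + x)

def resolve_then_tonemap(samples):
    """Average the HDR samples first, then tone map the result."""
    return tonemap(sum(samples) / len(samples))

def tonemap_then_resolve(samples):
    """Tone map each sample, then average the displayable values."""
    return sum(tonemap(s) for s in samples) / len(samples)

# A 4x MSAA edge pixel: two samples on a dark surface, two on a bright one.
samples = [0.0, 0.0, 15.0, 15.0]
naive = resolve_then_tonemap(samples)
post = tonemap_then_resolve(samples)
```

With these inputs the post-tone-mapping resolve lands halfway between the two displayed colours, while the naive order leaves the edge pixel almost as bright as the bright surface, so the edge still looks aliased.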
Post-Process Anti-Aliasing (PPAA)
An alternative to MSAA is to use an image-based post-process anti-aliasing (PPAA), which became practical with GPU ports of Morphological antialiasing (MLAA) [1] and further developments such as “Enhanced Subpixel Morphological Antialiasing” (SMAA) [2] and NVidia’s “Fast approximate anti-aliasing” (FXAA) [3]. Compared to MSAA, these PPAA techniques are easy to implement and work in scenarios where MSAA does not (such as deferred lighting and other non-geometry based aliasing), but lack adequate sub-pixel accuracy and are less temporally stable. They also cause perceptible blurring of textures and text, since it is difficult for edge-detection algorithms to distinguish between intentional colour discontinuities and unwanted aliasing caused by imperfect rendering.
Currently, two of the most popular PPAA algorithms are:
- SMAA is an algorithm based on MLAA but with a number of innovations and improvements, and with a number of quality/performance presets. It implements advanced pattern recognition and local contrast adaptation, and the more expensive variations use temporal super-sampling to reduce temporal instability and improve quality. The SMAA algorithm version referenced in this document is the latest public code, v2.7.
- FXAA is a much faster effect. However, FXAA has simpler colour discontinuity shape detection, causing substantial (frequently unwanted) image blurring. It also has a fairly limited kernel size by default, so it doesn't sufficiently anti-alias longer edge shapes, while increasing the kernel size impacts performance significantly. The FXAA algorithm version referenced in this document is v3.8 unless otherwise specified (the newest v3.11 was added to the sample at the last minute, in addition to v3.8).
A New Technique - Conservative Morphological Anti-Aliasing (CMAA)
In this sample we introduce a new technique called Conservative Morphological Anti-Aliasing (CMAA). CMAA addresses two requirements that are currently not addressed by existing techniques:
- To run efficiently on low- to mid-range GPU hardware, such as integrated GPUs, while providing a quality anti-aliasing solution. A budget of under 3 ms at a resolution of 1600x900, running on a 15 Watt, 4th generation Intel® Core™ processor, was used as a guide when developing the technique.
- To be minimally invasive so it can be acceptable as a replacement for 2x MSAA in a wide range of applications, including worst-case scenarios such as text, repeating patterns, certain geometries (power lines, mesh fences, foliage), and moving images.
CMAA is positioned between FXAA and SMAA 1x in computation cost (0.9-1.2x the cost of default FXAA 3.11 and 0.45-0.75x the cost of SMAA 1x) on 4th generation Intel® HD Graphics hardware and above. Compared to FXAA, CMAA provides significantly better image quality and temporal stability as it correctly handles edge lines up to 64 pixels long and is based on an algorithm that only handles symmetrical discontinuities in order to avoid unwanted blurring (thus being more conservative). When compared to SMAA 1x it will provide less anti-aliasing as it handles fewer shape types but also causes less blurring, shape distortion, and has more temporal stability (is less affected by small frame-to-frame image changes).
CMAA has four basic logical steps (not necessarily matching the order in the implementation):
- Image analysis for colour discontinuities (afterwards stored in a local compressed 'edge' buffer). The method used is not unique to CMAA.
- Extracting locally dominant edges with a small kernel. (Unique variation of existing algorithms).
- Handling of simple shapes.
- Handling of symmetrical long edge shapes. (Unique take on the original MLAA shape handling algorithm.)
Step 1. Image analysis for colour discontinuities (edges)
Edge detection is performed by comparing neighboring colours using:
- Sum of per-channel Luma-weighted colour difference in sRGB colour space (default)
- Luminance value calculated from the input in sRGB colour space (faster)
- Weighted Euclidean distance [6] (highest quality, slowest)
An edge (discontinuity) exists if the difference of neighboring pixel values is above a preset threshold (which is determined empirically).
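The first two comparison modes and the threshold test can be sketched as follows. This is a minimal illustration, not the sample's shader code: the Rec. 709 luma weights, the default threshold of 0.1, and all function names are assumptions, and pixels are taken to be (r, g, b) tuples already in sRGB space with components in [0, 1].

```python
# Assumed luma weights (Rec. 709); the article does not state which are used.
LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)

def luma(pixel):
    """Luminance of an (r, g, b) pixel."""
    return sum(w * c for w, c in zip(LUMA_WEIGHTS, pixel))

def luma_delta(a, b):
    """Faster mode: difference of the two pixels' luminance values."""
    return abs(luma(a) - luma(b))

def weighted_channel_delta(a, b):
    """Default mode: sum of per-channel differences, luma-weighted."""
    return sum(w * abs(ca - cb) for w, ca, cb in zip(LUMA_WEIGHTS, a, b))

def detect_edges(image, threshold=0.1, delta=weighted_channel_delta):
    """Return (right, bottom) boolean grids: an edge exists between a pixel
    and its right / bottom neighbour if their delta exceeds the threshold."""
    h, w = len(image), len(image[0])
    right = [[False] * w for _ in range(h)]
    bottom = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                right[y][x] = delta(image[y][x], image[y][x + 1]) > threshold
            if y + 1 < h:
                bottom[y][x] = delta(image[y][x], image[y + 1][x]) > threshold
    return right, bottom
```

Storing only the right and bottom edge of each pixel is one compact layout, since the left and top edges belong to the neighbouring pixels; the sample's own 'edge' buffer format may differ.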
Figure 2. Showing results of a default edge detection algorithm.
Step 2. Locally dominant edge detection (or, non-dominant edge pruning)
This step serves a similar function to “local contrast adaptation” in SMAA and “local contrast test” in FXAA, but with a smaller kernel. For each edge detected in Step 1, its colour delta value above the threshold (dEc) is compared to those of the 12 neighboring edges (dEn, Figure 3), and edges that are not locally dominant are pruned.
Figure 3. Neighboring edges considered for local contrast adaptation.
This smaller local adaptation kernel size is somewhat less efficient at increasing effective edge detection range. However, it is more effective at preventing blurring of small shapes (such as text), reducing local shape interference from less noticeable edges, avoiding some of the pitfalls of large kernels (visible kernel-sized transition from un-blurred to blurry), and has better performance.
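As a rough illustration of non-dominant edge pruning, the sketch below keeps an edge only if its delta exceeds a blend of the neighbours' average and maximum deltas. The comparison rule, the `blend` parameter, and the function name are assumptions made for this example; the exact criterion and the layout of the 12 neighbours (Figure 3) are not reproduced here.

```python
def is_locally_dominant(d_ec, d_en, blend=0.5):
    """Return True if the edge with delta `d_ec` should be kept.

    `d_en` holds the deltas of the neighbouring edges. The reference value
    is interpolated between their average and their maximum (assumed rule).
    """
    if not d_en:
        return True  # nothing to compare against
    avg = sum(d_en) / len(d_en)
    peak = max(d_en)
    reference = avg + (peak - avg) * blend
    return d_ec > reference

# A strong edge among weak neighbours is kept ...
keep = is_locally_dominant(0.5, [0.1] * 12)
# ... while a weak edge next to one much stronger edge is pruned.
prune = is_locally_dominant(0.2, [0.1] * 11 + [0.9])
```

A higher `blend` makes the test stricter, so fewer weak edges near strong ones survive.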
Step 3. Handling of simple shapes
Edges detected in step 1, and refined in step 2, are used to make assumptions about the shape of the underlying edge before rasterization (virtual shape). For simple shape handling, all pixels are analyzed for existence of 2, 3 and 4 edge aliasing shapes, and colour transfer is applied to match the virtual shape colour coverage and achieve the local anti-aliasing effect (Figure 4). While this colour transfer is not always symmetrical, the amount of shape distortion is minimized to sub-pixel size.
Figure 4. 2-edge, 3-edge and 4-edge shapes; reconstructed virtual shape shown in yellow; black arrows showing anti-aliasing colour blending direction
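Step 4. Handling of symmetrical long edge shapes
4a. Edges are analyzed for Z-shapes, each of which marks a shape rasterization pixel step (which is mostly a triangle edge). The criterion used for this detection is illustrated in Figure 5. Four Z-shape orientations (with 90° difference) are handled.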
Figure 5. Z-shape detection criterion is true if edges illustrated blue are present while red ones are not; green arrows show subsequent edge tracing.
4b. For each detected Z-shape, the length of the edge on each side is determined by tracing the edges along its direction (the horizontal edges to the left and right, for the two horizontal Z-shapes). Tracing stops when no such edge is present or when a perpendicular (vertical) edge is encountered (Figure 6).
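The tracing in 4b can be sketched in one dimension. This is a simplified illustration under stated assumptions: edges along a single row are given as two boolean lists (hypothetical names `horizontal` and `vertical`), the Z-shape sits at index `start`, and the 64-pixel limit mentioned earlier is applied per side, which is a guess about how that limit is used.

```python
MAX_TRACE = 64  # assumed per-side cap, based on the 64-pixel figure above

def trace_length(horizontal, vertical, start, step, max_steps=MAX_TRACE):
    """Count consecutive horizontal edges from `start` in direction `step`
    (+1 for right, -1 for left). Stops at a missing horizontal edge,
    at a vertical edge, at the row boundary, or at the cap."""
    length = 0
    x = start + step
    while 0 <= x < len(horizontal) and length < max_steps:
        if not horizontal[x] or vertical[x]:
            break
        length += 1
        x += step
    return length

# Z-shape at index 3: three edge pixels to the left, two to the right.
horizontal = [True, True, True, True, True, True, False, True]
vertical = [False] * 8
left = trace_length(horizontal, vertical, 3, -1)
right = trace_length(horizontal, vertical, 3, +1)
```

The two vertical Z-shape orientations would use the same routine with the roles of the horizontal and vertical edge lists swapped.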
4c. The edge length from the previous step is used to reconstruct the location of the virtual shape edge and apply colour transfer (to both sides of the Z-shape) to match coverage that it would have at each pixel. This step overrides any anti-aliasing done in Step 3 on the same pixels.
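One way to picture the colour transfer in 4c is a linear coverage model of the kind used by MLAA-style techniques. In the sketch below the virtual edge crosses half a pixel at the Z-shape step and reaches zero at the end of the traced length. The article does not give the exact weights, so the formula and the function names are illustrative assumptions.

```python
def blend_weights(length):
    """Blend weight for each of `length` pixels on one side of the Z-shape,
    ordered from the step outwards. Each weight is the average height of a
    line falling from 0.5 to 0 across that pixel."""
    return [0.5 * (1.0 - (k + 0.5) / length) for k in range(length)]

def blend(colour, neighbour, weight):
    """Mix `weight` of the colour across the edge into `colour`."""
    return tuple(c + (n - c) * weight for c, n in zip(colour, neighbour))

# Two traced pixels on one side: the pixel next to the step gets the most.
weights = blend_weights(2)
pixel = blend((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), weights[0])
```

Applying the same weights on the other side of the Z-shape, to the pixels across the edge, gives the symmetrical transfer described above: what one side gains, the other gives up.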
Figure 6. Long edge (Z-shapes): edge length tracing marked blue, with Z shape at center; reconstructed virtual shape shown in yellow; black arrows showing anti-aliasing colour blending direction.
The inherent symmetry of this approach better preserves the overall average colour and the original shapes of the image, ignores borderline cases, and is more temporally stable: a change of one or a few pixels does not induce drastic colour transfer and shape modification (when compared to SMAA 1x, FXAA 3.8/3.11 and older MLAA-based techniques).
Figure 7. Typical detection and handling of symmetrical Z shapes (circled in yellow)
Figure 8. All CMAA shapes: original image, edge detection and final anti-aliased image (with/without edges)
The sample UI allows a direct comparison between several anti-aliasing techniques, selectable from a drop-down menu, along with several debug features. All the techniques can be viewed in high detail using a zoom box that can be enabled from the UI and positioned with a right mouse click. For both CMAA and SAA, additional debug information highlights the edges detected by the algorithm, and a slider adjusts the threshold used for edge detection. In the case of CMAA, both the edge threshold and the non-dominant edge removal threshold can be modified.
The effect on performance of modifying the thresholds can be viewed if the application is run with vsync disabled. GPU performance metrics are displayed in the upper left-hand corner of the application, showing the overall cost of rendering the scene and the time taken in the post-processing anti-aliasing code. When viewing the stats, additional debug information such as the zoombox and the edge view should be disabled, as they both lower performance by forcing sub-optimal code paths. When viewing CMAA in the zoombox with “Show Edges” enabled, the zoombox also animates to show the effects of applying CMAA to the image; this does not affect the rest of the display.
For precise profiling of each technique, the “Run benchmark for: …” button can be used to activate automatic multiple-frame sampling and comparison, with the results (cost delta compared to the base non-AA version) displayed in a message box after the run is finished.
Figure 9. Debug information including zoombox and edge detection overlay.
In addition to showing the effect of the various post-processing effects on the real-time scene, the application allows a static image to be loaded and used as the source for the effect; the currently supported file format is PNG. A synthetic sample image is provided in the sample's media directory (Figure 9). Selecting 2x or 4x MSAA has no effect in this mode, because MSAA acts while the source image is being rendered, but CMAA, SMAA, FXAA and SAA can all be applied to the image. This feature lets anyone considering one of the post-processing techniques in the sample quickly load images taken from their own application and experiment with the various threshold parameters.
The following figures show a number of quality and performance comparisons:
Figure 10. Performance impact (frames per second) of an older implementation of CMAA and MSAA on a Consumer Ultra-Low Voltage (CULV) i7-4610Y CPU with HD Graphics 4200 GPU, in GRID2 by Codemasters*.
Figure 11. Cost and scaling of CMAA 1.3 and other post-process anti-aliasing effects, measured using the sample from the article, applied to 10 screenshots from various games and averaged, on Intel 4th generation CPUs (HD 5000 and HD 5200 graphics) and AMD R9-290.
Figure 12. Quality comparison for text and image anti-aliasing. CMAA 1.3 manages high quality anti-aliasing of the image while preserving text and without over blurring the geometry.
Figure 13. Quality comparison for synthetic shapes and game scenes with high frequency textures. CMAA preserves original high frequency texture data better than FXAA 3.11 and SMAA 1x, while still applying adequate anti-aliasing (although below the quality).
Figure 15. Impact of various techniques on GUI elements. Any post-process AA should always be applied before the GUI to avoid unwanted blurring, but there are cases when this is unavoidable (such as in a driver implementation or if the GUI is part of the 3D scene).
- MLAA, Reshetov, A. (2009). Morphological antialiasing. In HPG’09: Proceedings of the Conference on High Performance Graphics 2009, pages 109–116, New York, NY, USA. ACM
- SMAA, Jorge Jimenez and Jose I. Echevarria and Tiago Sousa and Diego Gutierrez 2012, JIMENEZ2012_SMAA, "SMAA: Enhanced Morphological Antialiasing", Computer Graphics Forum (Proc.EUROGRAPHICS 2012)
- NVidia Fast approximate anti-aliasing (FXAA), Timothy Lottes (2011)
- Venceslas Biri, Adrien Herubel, and Stephane Deverly, 2010. Practical morphological antialiasing on the GPU. In ACM SIGGRAPH 2010 Talks (SIGGRAPH '10). ACM, New York, NY, USA, Article 45, 1 page.
- Practical morphological antialiasing on the GPU
- Post-tonemapping resolve for high quality HDR antialiasing in D3D10 in ShaderX6