Conservative Morphological Anti-Aliasing (CMAA) - March 2014 Update

This article was taken from a blog posting on IDZ by Leigh Davies at Intel Corp, highlighting work and results completed by Leigh and his colleague Filip Strugar on the new AA technique referred to as Conservative Morphological Anti-Aliasing. Below is the content of the blog, along with the code project download for your examination.

This sample presents a new, image-based, post-processing anti-aliasing technique referred to as Conservative Morphological Anti-Aliasing and can be downloaded here. The technique was originally developed by Filip Strugar at Intel for use in GRID2 by Codemasters®, to offer a high-performance alternative to traditional multisample anti-aliasing (MSAA) while addressing artistic concerns with existing post-processing anti-aliasing techniques. The sample allows CMAA to be compared with several popular post-processing techniques, and with hardware MSAA, in a real-time rendered scene as well as on an existing image. The scene is rendered using a simple HDR technique and includes basic animation to allow the user to compare how the different techniques cope with temporal artifacts in addition to static portions of the image.

CMAA Sample using HDR & animating geometry

MSAA has long been used to reduce aliasing in computer games and significantly improve their visual appearance. Basic MSAA works by running the pixel shader once per pixel but running the coverage and occlusion tests at higher than normal resolution, typically 2x through 8x, and then merging the results together. While significantly faster than super sampling it still represents a significant additional cost compared to no anti-aliasing and is difficult to implement with certain techniques – for example, this sample uses a custom fullscreen pass needed to get correct post HDR tone-mapping MSAA resolve [Humus article in ShaderX6] [6].
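To illustrate why the MSAA resolve has to happen after tone mapping, here is a minimal Python sketch for a single edge pixel. The Reinhard operator `x / (1 + x)` and the sample values are assumptions chosen for illustration; the tone-mapping curve actually used by the sample is not described in this article.

```python
def reinhard(x):
    """Simple Reinhard tone-mapping operator (a stand-in, not the sample's curve)."""
    return x / (1.0 + x)


def resolve_then_tonemap(samples):
    """Average the HDR samples first, then tone-map the average."""
    return reinhard(sum(samples) / len(samples))


def tonemap_then_resolve(samples):
    """Tone-map every HDR sample, then average the displayable values."""
    return sum(reinhard(s) for s in samples) / len(samples)


# One edge pixel with 4 MSAA samples: two land on a very bright HDR surface,
# two land on a black background (hypothetical values).
samples = [16.0, 16.0, 0.0, 0.0]

naive = resolve_then_tonemap(samples)
post_tonemap = tonemap_then_resolve(samples)
fully_covered = reinhard(16.0)

print(f"resolve, then tone-map: {naive:.3f}")
print(f"tone-map, then resolve: {post_tonemap:.3f}")
print(f"fully covered pixel:    {fully_covered:.3f}")
```

With these values, averaging first gives about 0.89, almost as bright as a fully covered pixel (about 0.94), so the edge still looks aliased. Tone-mapping each sample first gives about 0.47, which is half of the covered value, as expected for 50% coverage.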

An alternative to MSAA is image-based post-process anti-aliasing (PPAA), which became practical with GPU ports of Morphological antialiasing (MLAA) [Reshetov 2009] [1] and further developments such as “Enhanced Subpixel Morphological Antialiasing” (SMAA) [2] and NVidia’s “Fast approximate anti-aliasing” (FXAA) [3]. Compared to MSAA, these PPAA techniques are easy to implement and work in scenarios where MSAA does not (such as deferred lighting and other non-geometry based aliasing), but lack adequate sub-pixel accuracy and are less temporally stable. They also cause perceptible blurring of textures and text, since it is difficult for edge-detection algorithms to distinguish between intentional colour discontinuities and unwanted aliasing caused by imperfect rendering.

Currently two of the most popular PPAA algorithms are:

  1. SMAA is an algorithm based on MLAA but with a number of innovations and improvements, and with a number of quality/performance presets. It implements advanced pattern recognition and local contrast adaptation, and the more expensive variations use temporal super-sampling to reduce temporal instability and improve quality. The SMAA algorithm version referenced in this document is the latest public code v2.7.
  2. FXAA is a much faster effect. However, FXAA has simpler colour discontinuity shape detection, causing substantial (frequently unwanted) image blurring. It also has a fairly limited kernel size by default, so it doesn't sufficiently anti-alias longer edge shapes, while increasing the kernel size impacts performance significantly. The FXAA algorithm version referenced in this document is v3.8 unless otherwise specified (the newest version, v3.11, was added to the sample at the last minute, in addition to 3.8).

In this sample we introduce a new technique called Conservative Morphological Anti-Aliasing (CMAA). CMAA addresses two requirements that are currently not addressed by existing techniques:

  1. To run efficiently on low- to mid-range GPU hardware, such as integrated GPUs, while providing a quality anti-aliasing solution. A budget of under 3 ms was used as a guide when developing the technique, at a resolution of 1600x900 running on a 15-watt, 4th Generation Intel® Core™ processor.
  2. To be minimally invasive so it can be acceptable as a replacement for 2xMSAA in a wide range of applications, including worst-case scenarios such as text, repeating patterns, certain geometries (power lines, mesh fences, foliage), and moving images.

CMAA is positioned between FXAA and SMAA 1x in computation cost (0.9-1.2x the cost of default FXAA 3.11 and 0.45-0.75x the cost of SMAA 1x) on Intel 4th Generation HD Graphics hardware and above. Compared to FXAA, CMAA provides significantly better image quality and temporal stability, as it correctly handles edge lines up to 64 pixels long and is based on an algorithm that only handles symmetrical discontinuities in order to avoid unwanted blurring (thus being more conservative). Compared to SMAA 1x, it provides less anti-aliasing as it handles fewer shape types, but it also causes less blurring and shape distortion and is more temporally stable (less affected by small frame-to-frame image changes).

CMAA has four basic logical steps (not necessarily matching the order in the implementation):

  1. Image analysis for colour discontinuities (afterwards stored in a local compressed 'edge' buffer). The method used is not unique to CMAA.
  2. Extracting locally dominant edges with a small kernel. (Unique variation of existing algorithms).
  3. Handling of simple shapes.
  4. Handling of symmetrical long edge shapes. (Unique take on the original MLAA shape handling algorithm.)

Step 1: Image analysis for colour discontinuities (edges)

Edge detection is performed by comparing neighboring colours using:

  • Sum of per-channel Luma-weighted colour difference in sRGB colour space (default)
  • Luminance value calculated from the input in sRGB colour space (faster)
  • Weighted Euclidean distance [6] (highest quality, slowest)

An edge (discontinuity) exists if the difference of neighboring pixel values is above a preset threshold (which is determined empirically).

 dot( abs(colorA.rgb-colorB.rgb), float3(0.2126,0.7152,0.0722)) > fThreshold
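As a rough illustration of this step, the following CPU-side Python sketch applies the default metric (the same expression as the shader line above) to every pixel and its right and bottom neighbours, and also shows the faster luminance-only variant. The function names, the tiny test image and the threshold value of 0.1 are hypothetical; the sample's actual default threshold is not given in this article.

```python
LUMA_WEIGHTS = (0.2126, 0.7152, 0.0722)


def luma_weighted_difference(color_a, color_b):
    """Default metric: sum of per-channel absolute differences, luma-weighted."""
    return sum(w * abs(a - b) for w, a, b in zip(LUMA_WEIGHTS, color_a, color_b))


def luminance(color):
    """Luminance of an (r, g, b) colour using the same weights."""
    return sum(w * c for w, c in zip(LUMA_WEIGHTS, color))


def luminance_difference(color_a, color_b):
    """Faster metric: difference of the two luminance values."""
    return abs(luminance(color_a) - luminance(color_b))


def detect_edges(image, threshold, metric=luma_weighted_difference):
    """Return two boolean grids: an edge towards the right neighbour and an
    edge towards the bottom neighbour, for each pixel of `image`
    (a list of rows of (r, g, b) tuples)."""
    height, width = len(image), len(image[0])
    right = [[False] * width for _ in range(height)]
    bottom = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                right[y][x] = metric(image[y][x], image[y][x + 1]) > threshold
            if y + 1 < height:
                bottom[y][x] = metric(image[y][x], image[y + 1][x]) > threshold
    return right, bottom


BLACK, WHITE = (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)
image = [[BLACK, WHITE],
         [BLACK, WHITE]]
right_edges, bottom_edges = detect_edges(image, threshold=0.1)  # 0.1 is a placeholder
print(right_edges)   # vertical boundary between the two columns
print(bottom_edges)  # no horizontal boundaries in this image

# Two colours with equal luminance: the faster metric sees no difference,
# while the default per-channel metric still reports one.
red = (1.0, 0.0, 0.0)
green = (0.0, 0.2126 / 0.7152, 0.0)
print(luminance_difference(red, green), luma_weighted_difference(red, green))
```

The last two lines hint at the trade-off between the options listed above: a luminance-only comparison is cheaper but cannot see boundaries between colours of equal luminance.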

Step 2: Locally dominant edge detection (or, non-dominant edge pruning)

This step serves a similar function to “local contrast adaptation” in SMAA and “local contrast test” in FXAA but with a smaller kernel. For each edge detected in Step 1, the edge's above-threshold colour delta (dEc) is compared to those of the 12 neighboring edges (dEn):

 The edge remains an edge if its dEc > lerp( average(dEn), max(dEn), ldeFactor), where ldeFactor is empirically chosen (defaults to 0.35).

This smaller local adaptation kernel size is somewhat less efficient at increasing effective edge detection range. However, it is more effective at preventing blurring of small shapes (such as text), reducing local shape interference from less noticeable edges, avoiding some of the pitfalls of large kernels (visible kernel-sized transition from un-blurred to blurry), and has better performance.
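The pruning test above can be sketched in a few lines of Python. The function names and the example delta values are hypothetical, and the sketch simply treats dEc and dEn as plain numbers without modelling how the shader gathers the 12 neighbours; only the comparison itself and the 0.35 default come from the text.

```python
def lerp(a, b, t):
    """Linear interpolation between a and b."""
    return a + (b - a) * t


def edge_survives(d_ec, d_en, lde_factor=0.35):
    """Keep the edge only if its delta exceeds a value between the average
    and the maximum of the neighbouring edge deltas."""
    average = sum(d_en) / len(d_en)
    return d_ec > lerp(average, max(d_en), lde_factor)


# Twelve neighbouring edge deltas: mostly flat, with two stronger edges nearby.
neighbours = [0.0] * 10 + [0.2, 0.4]

# average = 0.05, max = 0.4, so the limit is 0.05 + 0.35 * 0.35 = 0.1725
print(edge_survives(0.20, neighbours))  # above the limit, stays an edge
print(edge_survives(0.15, neighbours))  # below the limit, pruned
```

Raising `lde_factor` moves the limit towards the strongest neighbour, so more weak edges next to a dominant one are pruned.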

Step 3: Handling of simple shapes

Edges detected in step 1, and refined in step 2, are used to make assumptions about the shape of the underlying edge before rasterization (virtual shape). For simple shape handling, all pixels are analyzed for existence of 2, 3 and 4 edge aliasing shapes, and colour transfer is applied to match the virtual shape colour coverage and achieve the local anti-aliasing effect (Figure 4). While this colour transfer is not always symmetrical, the amount of shape distortion is minimized to sub-pixel size.

 Figure 4: 2-edge, 3-edge and 4-edge shapes; reconstructed virtual shape shown in yellow; black arrows showing anti-aliasing colour blending direction

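Step 4: Handling of symmetrical long edge shapes

4a. Edges are analyzed for Z-shapes, each representing a shape rasterization pixel step (which is mostly a triangle edge). The criterion used for this detection is illustrated in Figure 5. Four Z-shape orientations (with 90° difference) are handled.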

 Figure 5: Z-shape detection criterion is true if edges illustrated blue are present while red ones are not; green arrows show subsequent edge tracing

4b. For each detected Z-shape, the length of the edge to the left/right is determined by tracing the horizontal (for two horizontal Z-shapes) edges on both sides, and stopping if none are present on either side, or a vertical edge is encountered (Figure 6).
4c. The edge length from the previous step is used to reconstruct the location of the virtual shape edge and apply colour transfer (to both sides of the Z-shape) to match coverage that it would have at each pixel. This step overrides any anti-aliasing done in Step 3 on the same pixels.

 Figure 6: edge length tracing marked blue, with Z shape at center; reconstructed virtual shape shown in yellow; black arrows showing anti-aliasing colour blending direction
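Steps 4b and 4c can be illustrated with a simplified one-dimensional Python sketch for a horizontal Z-shape. Everything here is an assumption made for illustration: the function names, the per-pixel boolean edge flags, the choice not to count the pixel carrying the vertical edge, the way the 64-pixel limit is applied, and the linear coverage model (a virtual edge that drops from half a pixel at the Z-shape to zero at the end of the trace). The actual shader may differ in all of these details.

```python
def trace_length(has_horizontal_edge, has_vertical_edge, start, step, max_steps=64):
    """Walk along one pixel row from `start` in direction `step` (+1 or -1).
    Count pixels that still carry the horizontal edge; stop at the first
    pixel without it, at a vertical edge, at the row end, or after
    `max_steps` pixels (hypothetical limit)."""
    length = 0
    x = start
    while 0 <= x < len(has_horizontal_edge) and length < max_steps:
        if not has_horizontal_edge[x] or has_vertical_edge[x]:
            break
        length += 1
        x += step
    return length


def blend_weights(length):
    """Coverage of each traced pixel, nearest to the Z-shape first, for a
    virtual edge falling linearly from 0.5 pixel to 0 over `length` pixels."""
    return [0.5 * (1.0 - (i + 0.5) / length) for i in range(length)]


# One row of 8 pixels with a Z-shape between x = 3 and x = 4.
horizontal = [False, True, True, True, True, True, True, False]
vertical = [False, False, False, False, False, False, True, False]

left = trace_length(horizontal, vertical, start=3, step=-1)
right = trace_length(horizontal, vertical, start=4, step=+1)
print(left, right)

# Blend amounts for a 4-pixel trace; they shrink with distance from the Z-shape.
print(blend_weights(4))
```

In this model the weights for a trace of length 4 sum to 1.0, the area of the triangle between the stepped edge and the virtual edge, which is one way to see how the colour transfer matches the coverage the virtual shape would have at each pixel.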

The inherent symmetry of this approach better preserves the overall average image colour and the original shapes, ignores borderline cases, and is more temporally stable. Changes of one pixel (or a few pixels) do not induce drastic colour transfer and shape modification (when compared to SMAA 1X, FXAA 3.8/3.11 and older MLAA-based techniques).

 Typical detection and handling of symmetrical Z shapes (circled in yellow)

 original image, edge detection and final anti-aliased image (with/without edges)

The sample UI allows a direct comparison between several anti-aliasing techniques, which are selectable from a drop-down menu along with several debug features. All the techniques can be viewed in high detail using a zoom box that can be enabled from the UI and positioned with a right mouse click. For both CMAA and SAA, additional debug information highlights the actual edges that have been detected by the algorithm, and a slider allows you to adjust the threshold used for edge detection. In the case of CMAA, both the edge threshold and the non-dominant edge removal threshold can be modified.

The effect on performance caused by modifying the threshold can be viewed if the application is run with vsync disabled. GPU performance metrics are displayed in the upper left-hand corner of the application, showing the overall cost of rendering the scene and the time taken in the post-processing anti-aliasing code. When viewing the stats, additional debug information such as the zoombox and the edge view should be disabled, as they both lower performance by forcing sub-optimal code paths to be used. When viewing CMAA in the zoombox with “Show Edges” enabled, the zoombox will also animate to show the effects of applying CMAA to the image; this doesn’t affect the rest of the display.

For precise profiling of each technique, the “Run benchmark for: …” button can be used to activate automatic multiple-frame sampling and comparison, with results (cost delta compared to the base non-AA version) displayed in a message box after the run is finished.

 Debug information including zoombox and edge detection overlay.

In addition to showing the effect of the various post-processing effects on the real-time scene, the application allows a static image to be loaded and used as the source for the effect; the currently supported file format is PNG. A synthetic sample image is provided in the sample's media directory (Figure 9). Selecting 2x or 4x MSAA has no effect in this mode, since MSAA operates while the scene is rendered rather than on a finished image, but CMAA, SMAA, FXAA and SAA can all be applied to the image. This feature allows anyone considering any of the post-processing techniques in the sample to quickly load images taken from their own application and experiment with the various threshold parameters.

The following figures show a number of quality and performance comparisons:

 Performance impact (frames per second) of an older implementation of CMAA and MSAA on a Consumer Ultra-Low Voltage (CULV) i7-4610Y CPU with HD Graphics 4200 GPU, in GRID2 by Codemasters®

 Cost and scaling of CMAA 1.3 and other post-process anti-aliasing effects measured using the sample from the article, applied to 10 screenshots from various games, averaged, on Intel 4th generation CPUs (HD 5000 and GD5200 graphics) and AMD R9-290

 Quality comparison for text and image anti-aliasing. CMAA 1.3 manages high quality anti-aliasing of the image while preserving text and without over blurring the geometry.

 Quality comparison for synthetic shapes and game scenes with high frequency textures. CMAA preserves original high frequency texture data better than FXAA 3.11 and SMAA 1x, while still applying adequate anti-aliasing (although below the quality

 Quality comparison in a game 3D scene. CMAA preserves most of the original high frequency texture data and original geometry shapes, while still applying adequate anti-aliasing.

 Impact of various techniques on GUI elements. Any post-process AA should always be applied before GUI to avoid unwanted blurring, but there are cases when this is unavoidable (such as in a driver implementation or if the GUI is part of the 3D s

[1] MLAA, Reshetov, A. (2009). Morphological antialiasing. In HPG’09: Proceedings of the Conference on High Performance Graphics 2009, pages 109–116, New York, NY, USA. ACM
[2] SMAA, Jorge Jimenez, Jose I. Echevarria, Tiago Sousa and Diego Gutierrez (2012). "SMAA: Enhanced Subpixel Morphological Antialiasing", Computer Graphics Forum (Proc.
[3] NVidia "Fast approximate anti-aliasing" (FXAA), Timothy Lottes (2011),
[4] Venceslas Biri, Adrien Herubel, and Stephane Deverly. 2010. Practical morphological antialiasing on the GPU. In ACM SIGGRAPH 2010 Talks (SIGGRAPH '10). ACM, New York, NY, USA, Article 45, 1 page. DOI=10.1145/1837026.1837085

[6] "Post-tonemapping resolve for high quality HDR antialiasing in D3D10" in ShaderX6

File: CMAA1.3_26March2014.7z (76.78 MB)


Filip Strugar (Intel):

Hi Masaya T. - I only noticed the comments now so late response.

You are absolutely correct, this is a bug, should have a "+" there. Looking at the logs, there was a late optimisation that introduced this bug. I'm working on a new higher quality version - I've just checked it to make sure it doesn't have the same bug and it looks ok. I hope to publish this update early next year.

Hi Stanislav D. - yeah I think once the new update is out, I'll find someone to help me out with integrating it into Unity!

Masaya T.:

It seems to me that the calculation for an average of 12 edges in PruneNonDominantEdges() is not your intention.

It might be 

avg += dot(....

Anyway, Thanks for sharing valuable shader codes!


Marius P.:

Congratulations on this work! Could you also include the VS 2010 project files? I downloaded the archive but it includes only the *.sln file.

Filip Strugar (Intel):

I get the same occasionally (download stops at 50.6MB). However I found that if you use Internet Explorer, it will say "Download Interrupted" at 50.6MB but then you can choose to "Resume" and it will continue and download the whole archive, which can then be extracted fine - bizarre.

Haruto W.:

Hmm, My browser says "the file download is completed". but the downloaded 7z file is incomplete size.

It is about 50MB only and can not extract.


My acquaintance in Japan has also failed to download.

Can you extract it?

Since there are no other comments so far, does that mean other people were able to download it successfully?

Robert Svilpa (Intel):

I just confirmed the file download - did you get any message stating an explicit error?

Haruto W.:

I can't download the file.

Is the file broken?
