Intel® Advanced Encryption Standard (Intel® AES) Instructions Set - Rev 3.01


Intel® AES instructions are a new set of instructions available beginning with the all new 2010 Intel® Core™ processor family based on the 32nm Intel® microarchitecture codename Westmere. These instructions enable fast and secure data encryption and decryption using the Advanced Encryption Standard (AES), which is defined by FIPS Publication 197. Since AES is currently the dominant block cipher and is used in various protocols, the new instructions are valuable for a wide range of applications.

The architecture consists of six instructions that offer full hardware support for AES. Four instructions support AES encryption and decryption, and the other two support AES key expansion.
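As an illustration of how the six instructions (AESENC, AESENCLAST, AESDEC, AESDECLAST, AESKEYGENASSIST, AESIMC) divide the work, here is a minimal sketch of AES-128 single-block encryption using compiler intrinsics. It is not the paper's library code. It assumes GCC or Clang on x86-64, and the names `has_aesni`, `next_round_key`, `aes128_expand_key`, `aes128_encrypt_block`, and `aes128_selftest` are hypothetical helpers introduced only for this example.

```c
#include <cpuid.h>      /* __get_cpuid: GCC/Clang on x86 (assumption) */
#include <stdint.h>
#include <string.h>
#include <wmmintrin.h>  /* AES-NI intrinsics */

/* Enable AES-NI code generation per function, so no -maes flag is needed. */
#define AESNI_FN __attribute__((target("aes,sse2")))

/* CPUID leaf 1, ECX bit 25 reports AES-NI support. */
int has_aesni(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    return (int)((ecx >> 25) & 1u);
}

/* Combine the previous round key with the AESKEYGENASSIST output. */
AESNI_FN static __m128i next_round_key(__m128i key, __m128i assist) {
    assist = _mm_shuffle_epi32(assist, _MM_SHUFFLE(3, 3, 3, 3));
    key = _mm_xor_si128(key, _mm_slli_si128(key, 4));
    key = _mm_xor_si128(key, _mm_slli_si128(key, 4));
    key = _mm_xor_si128(key, _mm_slli_si128(key, 4));
    return _mm_xor_si128(key, assist);
}

/* The round constant must be an immediate, so a macro is used, not a loop. */
#define EXPAND(rk, i, rcon) \
    (rk)[i] = next_round_key((rk)[(i) - 1], \
                             _mm_aeskeygenassist_si128((rk)[(i) - 1], (rcon)))

/* Key expansion: one AESKEYGENASSIST per round key. */
AESNI_FN void aes128_expand_key(const uint8_t key[16], __m128i rk[11]) {
    rk[0] = _mm_loadu_si128((const __m128i *)key);
    EXPAND(rk, 1, 0x01); EXPAND(rk, 2, 0x02); EXPAND(rk, 3, 0x04);
    EXPAND(rk, 4, 0x08); EXPAND(rk, 5, 0x10); EXPAND(rk, 6, 0x20);
    EXPAND(rk, 7, 0x40); EXPAND(rk, 8, 0x80); EXPAND(rk, 9, 0x1b);
    EXPAND(rk, 10, 0x36);
}

/* Encryption: whitening XOR, nine AESENC rounds, one AESENCLAST round. */
AESNI_FN void aes128_encrypt_block(const __m128i rk[11],
                                   const uint8_t in[16], uint8_t out[16]) {
    __m128i s = _mm_loadu_si128((const __m128i *)in);
    s = _mm_xor_si128(s, rk[0]);
    for (int r = 1; r < 10; r++) s = _mm_aesenc_si128(s, rk[r]);
    s = _mm_aesenclast_si128(s, rk[10]);
    _mm_storeu_si128((__m128i *)out, s);
}

/* Returns 1 if the FIPS-197 Appendix C.1 vector matches, 0 on mismatch,
 * and -1 if the CPU does not report AES-NI. */
int aes128_selftest(void) {
    static const uint8_t key[16] = {
        0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
        0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f};
    static const uint8_t pt[16] = {
        0x00, 0x11, 0x22, 0x33, 0x44, 0x55, 0x66, 0x77,
        0x88, 0x99, 0xaa, 0xbb, 0xcc, 0xdd, 0xee, 0xff};
    static const uint8_t want[16] = {
        0x69, 0xc4, 0xe0, 0xd8, 0x6a, 0x7b, 0x04, 0x30,
        0xd8, 0xcd, 0xb7, 0x80, 0x70, 0xb4, 0xc5, 0x5a};
    __m128i rk[11];
    uint8_t ct[16];
    if (!has_aesni()) return -1;
    aes128_expand_key(key, rk);
    aes128_encrypt_block(rk, pt, ct);
    return memcmp(ct, want, sizeof ct) == 0;
}
```

Decryption would follow the same shape with AESDEC and AESDECLAST, using round keys passed through AESIMC. The CPUID check is there because executing these instructions on a processor without AES-NI raises an invalid-opcode fault.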

The AES instructions have the flexibility to support all usages of AES, including all standard key lengths, standard modes of operation, and even some nonstandard or future variants. They offer a significant increase in performance compared to the current pure-software implementations.

Beyond improving performance, the AES instructions provide important security benefits. By running in data-independent time and not using tables, they help eliminate the major timing and cache-based attacks that threaten table-based software implementations of AES. In addition, they make AES simple to implement, with reduced code size, which helps reduce the risk of inadvertently introducing security flaws, such as difficult-to-detect side-channel leaks.

This paper gives an overview of the AES algorithm and Intel's new AES instructions. It provides guidelines and demonstrations for using these instructions to write secure and high-performance AES implementations. This version of the paper also provides a high-performance library for implementing AES in the ECB/CBC/CTR modes and discloses, for the first time, the measured performance numbers.

Additional Resources

Breakthrough AES Performance with Intel AES-NI (PDF)

[Revision history: Rev. 1.0 in 4/2008; Rev. 2.0 in 4/2009; Rev. 3.0 in 5/2010; Rev. 3.01 in 9/2012]

Download Article

aes-wp-2012-09-22-v01.pdf (2.93 MB)



I have a question about Figure 31 on page 42. Why do you recommend using two instructions (AESENCLAST and AESDEC) to isolate the InvMixColumns operation? I thought that the InvMixColumns operation is computed by the single instruction AESIMC (see Figure 20). Isn't this a faster way to do the same thing?
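The equivalence this question relies on can be checked directly. With an all-zero round key, AESENCLAST computes SubBytes(ShiftRows(x)) and AESDEC computes InvMixColumns(InvSubBytes(InvShiftRows(y))), so chaining them leaves only InvMixColumns(x), which is also what AESIMC returns. The following is a minimal sketch, assuming GCC or Clang on x86-64. The helper names (`has_aesni`, `imc_direct`, `imc_two_step`, `imc_variants_agree`) are hypothetical, and the sketch says nothing about which variant is faster.

```c
#include <cpuid.h>      /* __get_cpuid: GCC/Clang on x86 (assumption) */
#include <stdint.h>
#include <string.h>
#include <wmmintrin.h>  /* AES-NI intrinsics */

#define AESNI_FN __attribute__((target("aes,sse2")))

/* CPUID leaf 1, ECX bit 25 reports AES-NI support. */
int has_aesni(void) {
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
    return (int)((ecx >> 25) & 1u);
}

/* InvMixColumns through the dedicated instruction. */
AESNI_FN static void imc_direct(const uint8_t in[16], uint8_t out[16]) {
    __m128i x = _mm_loadu_si128((const __m128i *)in);
    _mm_storeu_si128((__m128i *)out, _mm_aesimc_si128(x));
}

/* InvMixColumns isolated with AESENCLAST followed by AESDEC, both using a
 * zero round key so the final XOR in each instruction has no effect. */
AESNI_FN static void imc_two_step(const uint8_t in[16], uint8_t out[16]) {
    __m128i zero = _mm_setzero_si128();
    __m128i x = _mm_loadu_si128((const __m128i *)in);
    x = _mm_aesenclast_si128(x, zero);  /* SubBytes(ShiftRows(x)) */
    x = _mm_aesdec_si128(x, zero);      /* undoes both, then InvMixColumns */
    _mm_storeu_si128((__m128i *)out, x);
}

/* Returns 1 if both variants agree on a few fixed inputs, 0 if they differ,
 * and -1 if the CPU does not report AES-NI. */
int imc_variants_agree(void) {
    uint8_t in[16], a[16], b[16];
    if (!has_aesni()) return -1;
    for (int seed = 0; seed < 8; seed++) {
        for (int i = 0; i < 16; i++)
            in[i] = (uint8_t)(i * 17 + seed * 31);
        imc_direct(in, a);
        imc_two_step(in, b);
        if (memcmp(a, b, sizeof a) != 0) return 0;
    }
    return 1;
}
```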

Yes, there is a typo in Figure 35 (“AES128-ECB Decryption with On-the-Fly Key Expansion”).
To correct:

Replace the line

rcon = _mm_set_epi32(0x1b, 0x1b, 0x1b, 0x1b);

with

rcon = _mm_set_epi32(con2,con2,con2,con2);

Replace the line

rcon = _mm_set_epi32(0x1, 0x1, 0x1, 0x1);

with

rcon = _mm_set_epi32(con1,con1,con1,con1);

I believe the code in the section "AES128-ECB Decryption with On-the-Fly Key Expansion" is incorrect. Initially I just unrolled the loops, converted to asm, removed the useless shift after round 2, and didn't get the correct result. So I tried a few things that seemed like common sense (things like not interpreting what's going on, and changing enc to dec), but no dice. Then I looked a little closer and actually made an attempt to understand rather than just assuming typos like that: rcon is initially set to 0x1b, and then right shifted. Shouldn't this be set to 0x36 and then right shifted? Similarly, after the first 2 rounds of decryption, the value is set to 1, then right shifted (making it 0 immediately). Shouldn't this be set to 0x80? I seem to now be getting correct results with that. However, I had to work back through and undo all the stupid things I did initially to try and fix things. I think a few comments would go a long way to preventing the error and, should one happen, allow us to not interpret what's going on, but just search for the typo (hey, I can be lazy when someone else is optimizing the code, right? :) ). Anyhow, there's the heads up on the error!
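The values in this comment match the standard AES-128 round constants, which can be derived by repeated doubling in GF(2^8). The sketch below is an illustration only, not the paper's code. `xtime`, `aes_rcon`, and `aes_rcon_reversed` are hypothetical names, and it makes no claim about the values of `con1` and `con2` in the correction above, which are not shown here. It lists the constants in the order an on-the-fly decryption key schedule would consume them, starting from round 10.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply by x in GF(2^8) modulo the AES polynomial
 * x^8 + x^4 + x^3 + x + 1 (0x11b). */
static uint8_t xtime(uint8_t v) {
    return (uint8_t)((v << 1) ^ ((v & 0x80) ? 0x1b : 0x00));
}

/* Round constant for AES-128 key-expansion round 1..10:
 * 01 02 04 08 10 20 40 80 1b 36. */
uint8_t aes_rcon(int round) {
    assert(round >= 1 && round <= 10);
    uint8_t rc = 0x01;
    for (int i = 1; i < round; i++) rc = xtime(rc);
    return rc;
}

/* Fill out[0..9] with the constants in reverse round order (10 down to 1). */
void aes_rcon_reversed(uint8_t out[10]) {
    for (int i = 0; i < 10; i++) out[i] = aes_rcon(10 - i);
}
```

Walking backwards, a plain right shift takes 0x36 to 0x1b and then 0x80 down to 0x01, but it cannot bridge 0x1b to 0x80. That is consistent with reloading the constant after the first two rounds, as the comment describes.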

Where can I find the sample code?

Are they public?

@mikeault: probably the most popular native library, OpenSSL, supports AES-NI functionality. You can access it through the command line (the openssl command) or through OpenSSL's libcrypto library. It is available on several platforms, including Solaris, Linux, MS Windows, and so forth. Intel's own IPP library supports AES-NI (Linux and MS Windows). A short list of crypto software supporting AES-NI is on Wikipedia's "AES instruction set" article,

Is there a supported native library for accessing the AES-NI functionality?
If so, where can I obtain it?

So far, I have discovered only the example code within the white paper referenced by this article, and
the sample code and library located at:

My intuition tells me that Intel never intended for every exploiter to write, debug, and support
his own native library to access the AES-NI functionality.

I am cautiously optimistic that I just haven't located it yet.

If performance is similar on all cores, this probably means the bottleneck isn't the CPU. When working with applications like TrueCrypt to copy large encrypted files, the disk throughput can often become the bottleneck.

Strangely enough the multithreading speed seems to be the same when using AES regardless of the Core model (i3, i5 or i7). For instance, comparing the results from this prog ( the speed with which it operates for me is the same on my i3 as my i7.

Here's the blog article, at (hopefully I can post URLs):

I have a blog article on the use of Intel AES-NI in Oracle's Sun Solaris Operating System to improve performance. Briefly, it's faster :-) and AES-NI is used and supported on Oracle Solaris 11 Express 2010.11 and on Solaris 10 10/09 (aka update 8).

