Cycle counts of the new Westmere instructions

How many cycles do the new instructions require,
and can they be paired with other instructions?

aesimc
aeskeygenassist
aesenc
aesenclast
aesdec
aesdeclast
pclmulqdq

The AES-NI white paper has some performance results from which you could estimate the instruction latencies: http://software.intel.com/en-us/articles/intel-advanced-encryption-standard-aes-instructions-set/. It explicitly mentions that they are pipelined too.
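If you would rather measure than estimate, here is a minimal sketch of how latency and pipelined throughput could be compared for aesenc. It is not from the white paper. It assumes GCC or Clang on x86, and the function names, ITERS, and the choice of four parallel chains are all made up for illustration. Note that __rdtsc counts reference ticks, which are not necessarily core clock cycles, so treat the results as rough.

```c
#include <stdint.h>
#include <x86intrin.h>   /* GCC/Clang: AES-NI intrinsics and __rdtsc */

#define ITERS 1000000    /* arbitrary iteration count (assumption) */

/* Returns nonzero if the CPU reports AES-NI support (GCC/Clang builtin). */
static int aesni_available(void)
{
    return __builtin_cpu_supports("aes");
}

/* One dependent chain: each aesenc must wait for the previous result,
   so ticks per instruction approximate the instruction latency. */
__attribute__((target("aes,sse2")))
static double aesenc_latency_ticks(void)
{
    __m128i key = _mm_set1_epi32(0x01234567);
    __m128i x   = _mm_set1_epi32(0x0badcafe);

    uint64_t t0 = __rdtsc();
    for (int i = 0; i < ITERS; i++)
        x = _mm_aesenc_si128(x, key);
    uint64_t t1 = __rdtsc();

    /* Store the result so the compiler cannot discard the loop. */
    volatile int sink = _mm_cvtsi128_si32(x);
    (void)sink;
    return (double)(t1 - t0) / ITERS;
}

/* Four independent chains: a pipelined unit can overlap them, so ticks
   per instruction move toward the reciprocal throughput. Four chains
   may not be enough to saturate the unit on every CPU. */
__attribute__((target("aes,sse2")))
static double aesenc_throughput_ticks(void)
{
    __m128i key = _mm_set1_epi32(0x01234567);
    __m128i a = _mm_set1_epi32(1), b = _mm_set1_epi32(2);
    __m128i c = _mm_set1_epi32(3), d = _mm_set1_epi32(4);

    uint64_t t0 = __rdtsc();
    for (int i = 0; i < ITERS; i++) {
        a = _mm_aesenc_si128(a, key);
        b = _mm_aesenc_si128(b, key);
        c = _mm_aesenc_si128(c, key);
        d = _mm_aesenc_si128(d, key);
    }
    uint64_t t1 = __rdtsc();

    /* Combine all four results so none of the chains is dead code. */
    __m128i r = _mm_xor_si128(_mm_xor_si128(a, b), _mm_xor_si128(c, d));
    volatile int sink = _mm_cvtsi128_si32(r);
    (void)sink;
    return (double)(t1 - t0) / (4.0 * ITERS);
}
```

Check aesni_available() first, then call both functions and compare. If the instruction is pipelined, the second figure should come out noticeably lower than the first. The same pattern should work for the other instructions by swapping in the matching intrinsic (for example _mm_aesdec_si128, or _mm_clmulepi64_si128 with the "pclmul" target).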

You can expect pclmulqdq to perform the same as other vector multiplications.

It varies by implementation. You can take your AES kernel(s) and run them through the Intel Architecture Code Analyzer to see throughput, latency, etc.
http://software.intel.com/en-us/articles/intel-architecture-code-analyzer/
