Circular rotate that does not violate C/C++ standard?

Circular rotate that does not violate C/C++ standard?

Intel compilers lack an intrinsic to access the underlying rotate instruction. We are having trouble crafting a portable circular rotate that executes in near constant time, does not violate the standards and is guaranteed to reduce to one machine instruction.

First, the naive implementation for left rotate on a 32-bit unsigned int:

    unsigned int y = (x << n) | (x >> (32 - n));

The code above suffers undefined behavior when n = 0. So a second attempt:

    unsigned int y = n ? ((x << n) | (x >> (32 - n))) : x;

The above introduces a branch, which is less efficient. Additionally, it may offer an attacker the ability to perform timing analysis in security sensitive and high integrity software.

GCC identified the issue and decided to recognize the following pattern (cf., "Poor optimization of portable rotate idiom", http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57157):

    unsigned int y = (x << n) | (x >> (-n & 31));

My question is, what circular rotate pattern does ICC/ICPC recognize to ensure the C/C++ statements are reduced to a single ASM instruction?

4 posts / 0 new
Last post
For more complete information about compiler optimizations, see our Optimization Notice.

The Intel Compiler does support intrinsics that map to the scalar rotate instructions. In the Intel Intrinsics Guide that you referenced, click on "Other" and search by "rot". That will get you documentation on these 6 intrinsics: _lrotl, _lrotr, _rotl, _rotr, _rotwl, _rotwr. These provide left & right rotation of unsigned short, int, and long and give well-defined results for all values of n.

In addition to supporting the intrinsics, the Intel Compiler will automatically recognize some C rotation idioms, including the ones that you mentioned. Whether a particular idiom violates the standard depends on the range of n. If n is known to be in the range [0, 31], then the GCC pattern is suitable. If you want a well-defined idiom that works for all values of n, you could modify it slightly:

    unsigned int y = x << (n & 31) | x >> (-n & 31);

This idiom is recognized and optimized by the Intel Compiler.

David Kreitzer
Intel Compilers

 

Thanks David, perfect.

Related:

> The Intel Compiler does support intrinsics that map to the scalar rotate instructions.

I searched for the word "rotate" in my browser. I thought I searched for "rot". I got 0 hits. I also searched for the same terms in the page's search box, which only resulted in _mm_* functions (and not the scalar functions).

Maybe the page is broken since it did not return the expected results? (Sorry if I sound harsh. After years of dealing with broken links, I have little sympathy for web masters and they pages they provide).

Here's a direct link to the list of scalar rotate intrinsics:

https://software.intel.com/sites/landingpage/IntrinsicsGuide/#text=rot&t...

Leave a Comment

Please sign in to add a comment. Not a member? Join today