Integer vs unsigned integer performance

Hi,

Long post, please bear with me ;)

Last weekend at a Hackathon we were discussing the performance of signed vs unsigned integers: at an HPC workshop someone had mentioned to me that you should use 'int' instead of 'unsigned' or 'size_t', as 'int' would be much faster. Of course, nobody believed this, so I ended up cobbling together a piece of code to test it. The code is based on a (very dumb) example to determine whether a number is prime or not:

int isPrime(MyIntegerType n)
{
  if (n <= 1) return 0;
  for (MyIntegerType i = 2; i < n; i++)
  {
     if (n % i == 0) return 0;
  }
  return 1;
}

I realize that this algorithm is very bad (and not even 100% correct, as one can argue whether '2' is prime or not), and I realize that signed and unsigned integers have different overflow/wrap-around rules, so the compiler can reason about the loop bound differently in the two cases. However, this is about the performance of using signed vs unsigned, nothing else.
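
For what it's worth, the codegen difference is easy to see for yourself. A small file like the one below (just the function above instantiated for the two types; the exact flags are not important) can be compiled with -S and the two assembly listings compared:

/* isprime_cmp.c - compile with e.g. "gcc -std=c99 -O2 -S isprime_cmp.c"
   and diff the assembly of the two functions */

int isPrime_int(int n)
{
  if (n <= 1) return 0;
  for (int i = 2; i < n; i++)
  {
     if (n % i == 0) return 0;
  }
  return 1;
}

int isPrime_unsigned(unsigned n)
{
  if (n <= 1) return 0;
  for (unsigned i = 2; i < n; i++)
  {
     if (n % i == 0) return 0;
  }
  return 1;
}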

With a wrapper around the above code I ended up with a (64-bit) test program that tests this using
  MyIntegerType = {'int', 'unsigned', 'long', 'unsigned long', 'size_t'}
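
The wrapper itself is nothing special; a minimal sketch of that kind of driver (not the exact code I used - the candidate range, the timer and the way the result is consumed are all placeholders here) looks roughly like this:

#include <stdio.h>
#include <time.h>

typedef unsigned MyIntegerType;   /* change this typedef and recompile for each type */

int isPrime(MyIntegerType n)
{
  if (n <= 1) return 0;
  for (MyIntegerType i = 2; i < n; i++)
  {
     if (n % i == 0) return 0;
  }
  return 1;
}

int main(void)
{
  struct timespec t0, t1;
  long count = 0;

  clock_gettime(CLOCK_MONOTONIC, &t0);
  for (MyIntegerType n = 2; n < 100000; n++)   /* placeholder upper bound */
    count += isPrime(n);
  clock_gettime(CLOCK_MONOTONIC, &t1);

  double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) * 1e-9;
  printf("%ld primes found in %.3f s\n", count, secs);
  return 0;
}
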
The results are quite surprising:

  1. It does not really matter which compiler you use for this; gcc 4.8, gcc 7 and icc generate almost identical assembly code, but the assembly code for 'int' *IS* different from the code generated for 'unsigned int'
  2. A good old Harpertown CPU can keep up in this test with CPUs that have a higher clock speed and are 4+ years newer
  3. Performance differs per hardware platform using the same binary
  4. On Ivy Bridge, Haswell and Broadwell CPUs it makes sense to use 'int' instead of 'unsigned int'
  5. On KNL (+KNC) and Atom-based CPUs it's exactly the other way round: it makes sense to use 'unsigned' instead of 'int'
  6. On all others there is no significant difference between using one or the other
  7. There is a *huge* penalty for using "long" on Intel CPUs
  8. Performance of the KNL box is surprisingly low, even given the fact that it runs at 1.3 or 1.5 GHz - compared to e.g. a 3.3 GHz Pentium G4400

Attached is a spreadsheet with all the platforms it was tested on so far.

Now for my questions:

  1. can someone explain the difference in execution time between different platforms, given the fact that the same executable was used on all platforms (hence no compiler differences)?
  2. can someone explain why there is such a huge performance penalty for 64-bit ints vs 32-bit ints? (Note that CPUs made by a certain rival company do not display this behaviour; a small snippet that isolates the divide is sketched after this list.)
  3. how can a programmer know up-front what will be the best type for a given algorithm+CPU?
  4. why is performance on the KNL so bad compared to the rest?
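
For question 2 specifically, here is a small stand-alone snippet (repeat count and operand values are arbitrary; the volatile accumulators are only there to stop the compiler from removing the loops) that times nothing but the 32-bit and 64-bit modulo, which is essentially what the inner loop of isPrime() spends its time on:

#include <stdio.h>
#include <time.h>

#define REPS 100000000L   /* arbitrary repeat count */

static double seconds(struct timespec a, struct timespec b)
{
  return (b.tv_sec - a.tv_sec) + (b.tv_nsec - a.tv_nsec) * 1e-9;
}

int main(void)
{
  struct timespec t0, t1;
  volatile unsigned      r32 = 0;
  volatile unsigned long r64 = 0;

  clock_gettime(CLOCK_MONOTONIC, &t0);
  for (long i = 0; i < REPS; i++)
    r32 += 1000003u % (unsigned)(i | 3);          /* 32-bit divide/modulo */
  clock_gettime(CLOCK_MONOTONIC, &t1);
  printf("32-bit modulo: %.3f s\n", seconds(t0, t1));

  clock_gettime(CLOCK_MONOTONIC, &t0);
  for (long i = 0; i < REPS; i++)
    r64 += 1000003ul % (unsigned long)(i | 3);    /* 64-bit divide/modulo */
  clock_gettime(CLOCK_MONOTONIC, &t1);
  printf("64-bit modulo: %.3f s\n", seconds(t0, t1));

  return 0;
}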

Source code and/or binaries for the above test results are available upon request.

Thx,  JJK

Attachment: signed_vs_unsigned.pdf (30.09 KB)

This: http://www.agner.org/optimize/instruction_tables.pdf may shed some light on the issue.

Jim Dempsey
