ippsCompare_8u much slower than strcmp()


Hi,

I have been trying to compare these two functions, as I have been told that the IPP functions are much faster than the C string functions.

However, I could not write a single example where ippsCompare_8u is at least as fast as strcmp(). The only note I found about ippsCompare says "for long strings" (whatever that means).

My example is really simple. I allocate the two strings to compare dynamically, so that (I guess) the compiler shouldn't be able to replace strcmp()/strlen() with simple constants.

Compilation: icpc -O3 ipp.cpp -I /opt/apps/intel/ipp/include/ -L /opt/apps/intel/ipp/lib/intel64/ -lippch_l -lippcore_l

I'm using the IPP version bundled with composer_xe_2013. The machine runs 64-bit Red Hat Linux with 4 Intel(R) Xeon(R) CPU X5690 @ 3.47GHz (6 cores per CPU) and 200GB of RAM.

I'm measuring the execution time with the "time" command on Linux. The strings I used in my tests were 100000000 characters long.

My questions are:

  1. Am I using ippsCompare incorrectly?
  2. Is my assumption correct (that ippsCompare should be faster than strcmp() for some "long" string)? If yes, what does "long string" mean?

Code sample:

    7 int main(int argc, char **argv)
    8 {
    9     ippInit();
   10
   11     size_t first_length = atoi(argv[2]);
   12     char *first = new char[first_length+1];
   13     for (size_t i=0; i<first_length; ++i)
   14         first[i]='A';
   15     first[first_length]='\0';
   16     size_t second_length = atoi(argv[3]);
   17     char *second = new char[second_length+1];
   18     for (size_t i=0; i<second_length; ++i)
   19         second[i]='A';
   20     second[second_length]='\0';
   21     second[second_length-1]='B';
   22
   23     //std::string ff=first;
   24     //std::string ss=second;
   25     //std::string ff(first), ss(second);
   26     size_t hmt = atol(argv[1]);
   27     long long final_result = 0;
   28     size_t counter = 0;
   29     for (size_t i=0; i<hmt; ++i)
   30     {
   31         /**
   32          * [first test]
   33          *
   34          *  Ipp8u *f = (Ipp8u*)first, *s = (Ipp8u*)second;
   35          *  size_t max_length = std::max(strlen(first), strlen(second));
   36          *  int result = 0;
   37          *  IppStatus rc = ippsCompare_8u(f, s, max_length, &result);
   38          */
   39
   40         /**
   41          * [second test]
   42          */
   43         int result = strcmp(first, second);
   44
   45         final_result += result;
   46     }
   47
   48     std::cout <<final_result <<std::endl;
   49     delete[] first;
   50     delete[] second;
   51     return 0;
   52 } 


Hi Luca,

Could you repeat your experiment without "strlen" in the IPP path?

"strlen" is itself a function that scans the whole string looking for the terminating 0, and that takes time too.

Regards,
Sergey


Hi,

thank you for your answer. Since I already know the lengths of the input strings, I changed the code to get rid of those two strlen() calls:

size_t max_length = std::max(/*strlen(first)*/first_length, /*strlen(second)*/second_length);

Still, it didn't make any real difference:

INTEL IPP version: (the input params are: <how_many_iterations_to_repeat>, <first_string_length>, <second_string_length>)

bash-4.1$ time ./a.out 100 100000000 100000000
-100

real    0m1.765s
user    0m1.726s
sys     0m0.035s
bash-4.1$ time ./a.out 1000 100000000 100000000
-1000

real    0m17.546s
user    0m17.471s
sys     0m0.035s

STRCMP version:

bash-4.1$ time ./a.out 100 100000000 100000000
-100

real    0m0.122s
user    0m0.085s
sys     0m0.037s
bash-4.1$ time ./a.out 1000 100000000 100000000
-1000

real    0m0.122s
user    0m0.086s
sys     0m0.036s
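
A side note on the first question (usage): the sample passes the larger of the two lengths as the comparison length, so a fixed-length comparison would read past the end of the shorter buffer whenever the lengths differ. The sketch below is only an illustration of a safer length choice. It uses std::memcmp as a standard-library stand-in for a fixed-length compare (it is not IPP code), and compare_known_lengths is a hypothetical helper name, not something from the thread.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not from the thread): fixed-length comparison that
// never reads past either buffer. `a` holds a_len chars plus '\0' and
// `b` holds b_len chars plus '\0'.
int compare_known_lengths(const char* a, std::size_t a_len,
                          const char* b, std::size_t b_len)
{
    // min + 1 includes the terminator of the shorter string, so a proper
    // prefix still compares as "smaller" without touching memory beyond it.
    const std::size_t n = std::min(a_len, b_len) + 1;
    const int r = std::memcmp(a, b, n);
    return (r > 0) - (r < 0);   // normalize to -1, 0 or 1
}
```

With equal lengths, as in the runs above, the difference does not matter; it only shows up when the second and third arguments differ.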

Hi Luca,

That's a challenge! :).

Could you update your code as follows: after line 43 (the strcmp or ippsCompare call), add "second[second_length-1] = i;"?
You will probably see different timing values.
Ignore the difference in final_result; it arises because strcmp and ippsCompare differ in their return values.
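
A minimal sketch of that idea, using only the standard library: one string is changed after every call so that no two iterations compare identical inputs, and only the sign of each result is summed so that totals stay comparable across functions with different return conventions. The names BenchResult and bench_strcmp_mutating are hypothetical, the timing uses std::chrono instead of the "time" command, and the written character is kept away from '\0' and 'A' (an assumption added here so the string length never changes).

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical result type: sum of comparison signs plus elapsed wall time.
struct BenchResult {
    long long sign_sum;
    double seconds;
};

// Hypothetical helper (not from the thread). Requires length >= 1.
BenchResult bench_strcmp_mutating(std::size_t iterations, std::size_t length)
{
    std::string first(length, 'A');
    std::string second(length, 'A');
    long long sign_sum = 0;

    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        const int r = std::strcmp(first.c_str(), second.c_str());
        sign_sum += (r > 0) - (r < 0);   // keep only the sign of the result
        // Change the last character after every call ('B'..'I', never '\0'
        // or 'A'), so the next comparison sees a different input.
        second[length - 1] = static_cast<char>('B' + i % 8);
    }
    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return BenchResult{sign_sum, elapsed.count()};
}
```

Calling bench_strcmp_mutating(1000, 100000000) would mirror the parameters used earlier in the thread; the same loop shape can be reused for the ippsCompare_8u path.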

P.S.: A hint from your message:

     STRCMP version:

    bash-4.1$ time ./a.out 100 100000000 100000000
    -100

    real    0m0.122s
    user    0m0.085s
    sys     0m0.037s
    bash-4.1$ time ./a.out 1000 100000000 100000000
    -1000

    real    0m0.122s
    user    0m0.086s
    sys     0m0.036s

Regards,
Sergey


Hi Sergey,

you convinced me :) Even for 256-character strings ippsCompare is really much, much faster. So in the end my test wasn't good enough. I didn't know the compiler was smart enough to recognize that the same two strings were being compared every time. So what is probably happening when strcmp() is used is:

1. strcmp() is called once and the result is cached.

2. All the other N-1 iterations just reuse the cached value. By changing at least one of the strings on each iteration we get N calls to strcmp(), and that is the time we should actually compare.
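
The two steps above can be sketched in code. This is only a conceptual illustration of what an optimizer may do when it can see that strcmp() has no side effects and that neither string changes inside the loop; it is not the actual output of any compiler. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// The loop as written in the benchmark: strcmp() appears to run n times.
long long sum_as_written(const char* a, const char* b, std::size_t n)
{
    long long total = 0;
    for (std::size_t i = 0; i < n; ++i)
        total += std::strcmp(a, b);
    return total;
}

// What the optimizer may effectively produce: a single call whose result
// is reused, so the run time no longer depends on n.
long long sum_after_hoisting(const char* a, const char* b, std::size_t n)
{
    if (n == 0)
        return 0;
    const int once = std::strcmp(a, b);
    return static_cast<long long>(once) * static_cast<long long>(n);
}
```

This would be consistent with the strcmp timings earlier in the thread, where 100 and 1000 iterations both took about 0.12s.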

Anyway, thank you very much for your help!

Luca

Yes, you're right. This often happens with the Intel compiler: it throws constant expressions away.

Regards,
Sergey

