Understanding a vectorization report with contradicting lines

Andrey Vladimirov

Hello,

I am trying to understand the inner monologue of the Intel compiler when it compiles a loop with assumed vector dependence:

void SomeFunction(const int n, int *a, int *b) 
{ 
#pragma ivdep
 for (int i = 0; i < n; i++) 
 a[i] = b[i];
}

Here is the result:

[avladim@dublin ~]$ icpc -c -vec-report3 ivdep.cc
ivdep.cc(4): (col. 3) remark: LOOP WAS VECTORIZED.
ivdep.cc(4): (col. 3) remark: loop was not vectorized: not inner loop.
[avladim@dublin ~]$

The problem here is that the first line of the output contradicts the second line.

This is a trivial example, and I know that in this case, the loop is vectorized, because if I do something funny like "SomeFunction(100, a+1, a)", the application will crash. But for more complex applications, I would like to know: when I see messages like this, should I trust the last line or the line in capital letters, or is there some other output that I can look at?

Thanks!

Andrey

Tim Prince

Often, such messages are an indication the compiler has generated multiple versions, with selection to be made at run time.  This one is a surprise, particularly since it doesn't appear to be checking for your bad case.  I'd be curious if the same thing happens when vectorization is promoted by * restrict.
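To make "multiple versions, with selection to be made at run time" concrete, here is a minimal hand-written sketch of what such a selection could look like if it did check for the overlapping case. The function name CopyMultiversioned and the overlap test are illustrative assumptions, not what icpc actually emits.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical hand-written analogue of a multiversioned copy loop.
// Returns true when the fast (non-overlapping) version was selected.
bool CopyMultiversioned(int n, int* a, const int* b) {
    assert(n >= 0);
    // Compare addresses as integers; this assumes a flat address space.
    const std::uintptr_t pa = reinterpret_cast<std::uintptr_t>(a);
    const std::uintptr_t pb = reinterpret_cast<std::uintptr_t>(b);
    const std::uintptr_t bytes = static_cast<std::uintptr_t>(n) * sizeof(int);
    const bool disjoint = (pa + bytes <= pb) || (pb + bytes <= pa);
    if (disjoint) {
        // Version 1: the ranges do not overlap, so a block copy is safe.
        std::memcpy(a, b, bytes);
        return true;
    }
    // Version 2: possible overlap, keep strict element-by-element order.
    for (int i = 0; i < n; i++) a[i] = b[i];
    return false;
}
```

Called with two separate arrays, this sketch takes the first version; called as CopyMultiversioned(n, p + 1, p), it falls back to the scalar loop and gives the same result as unvectorized code.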

Andrey Vladimirov

The same message and the same result with the keyword restrict. And I made a test run to confirm that vectorization does occur (see below). Probing the fool-proofness of the compiler may not be the most useful thing to do, but I just want to better understand the diagnostic messages. Thank you for the insight!

1 #include <cstdio>
 2 
 3 const int N = 100;
 4 int A[N];
 5 
 6 void MyCopyScalar(int n, int* a, int* b) { 
 7 for (int i = 0; i < n; i++)
 8 a[i] = b[i]; 
 9 }
 10 
 11 void MyCopyIvdep(int n, int* a, int* b) { 
 12 #pragma ivdep
 13 for (int i = 0; i < n; i++)
 14 a[i] = b[i]; 
 15 }
 16 
 17 void MyCopyRestrict(int n, int* restrict a, int* restrict b) { 
 18 for (int i = 0; i < n; i++)
 19 a[i] = b[i]; 
 20 }
 21 
 22 void Init() { 
 23 for (int i = 0; i < N; i++) 
 24 A[i] = i; 
 25 }
 26 
 27 void Output(const char* msg) {
 28 printf("%s:\n", msg);
 29 for (int i = 0; i < 12; i++) 
 30 printf("%4d", A[i]); printf("...\n\n");
 31 }
 32 
 33 int main() {
 34 
 35 Init();
 36 Output("Original array");
 37 
 38 Init(); 
 39 MyCopyScalar(N-1, &A[1], &A[0]);
 40 Output("After MyCopyScalar (correct result)");
 41 
 42 Init(); 
 43 MyCopyIvdep(N-1, &A[1], &A[0]);
 44 Output("After MyCopyIvdep");
 45 
 46 Init(); 
 47 MyCopyRestrict(N-1, &A[1], &A[0]);
 48 Output("After MyCopyRestrict");
 49 
 50 }

[avladim@dublin ~]$ icpc -c vectorcopy.cc -restrict -vec-report3
vectorcopy.cc(35): (col. 3) remark: LOOP WAS VECTORIZED.
vectorcopy.cc(36): (col. 3) remark: loop was not vectorized: existence of vector dependence.
vectorcopy.cc(38): (col. 3) remark: LOOP WAS VECTORIZED.
vectorcopy.cc(39): (col. 3) remark: loop skipped: multiversioned.
vectorcopy.cc(39): (col. 3) remark: loop was not vectorized: not inner loop.
vectorcopy.cc(40): (col. 3) remark: loop was not vectorized: existence of vector dependence.
vectorcopy.cc(42): (col. 3) remark: LOOP WAS VECTORIZED.
vectorcopy.cc(43): (col. 3) remark: LOOP WAS VECTORIZED.
vectorcopy.cc(43): (col. 3) remark: loop was not vectorized: not inner loop.
vectorcopy.cc(44): (col. 3) remark: loop was not vectorized: existence of vector dependence.
vectorcopy.cc(46): (col. 3) remark: LOOP WAS VECTORIZED.
vectorcopy.cc(47): (col. 3) remark: LOOP WAS VECTORIZED.
vectorcopy.cc(47): (col. 3) remark: loop was not vectorized: not inner loop.
vectorcopy.cc(48): (col. 3) remark: loop was not vectorized: existence of vector dependence.
vectorcopy.cc(7): (col. 3) remark: loop skipped: multiversioned.
vectorcopy.cc(7): (col. 3) remark: loop was not vectorized: not inner loop.
vectorcopy.cc(13): (col. 3) remark: loop was not vectorized: loop was transformed to memset or memcpy.
vectorcopy.cc(18): (col. 3) remark: loop was not vectorized: loop was transformed to memset or memcpy.
vectorcopy.cc(29): (col. 3) remark: loop was not vectorized: existence of vector dependence.
vectorcopy.cc(23): (col. 3) remark: LOOP WAS VECTORIZED.
[avladim@dublin ~]$ ./a.out 
Original array:
0 1 2 3 4 5 6 7 8 9 10 11...
After MyCopyScalar (correct result):
0 0 0 0 0 0 0 0 0 0 0 0...
After MyCopyIvdep:
0 0 0 0 0 4 5 6 6 8 9 10...
After MyCopyRestrict:
0 0 0 0 0 4 5 6 6 8 9 10...
[avladim@dublin ~]$
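As a rough illustration of why a copy that moves several elements at a time differs from the scalar loop when a and b overlap, the sketch below emulates a vector copy by loading a block of elements before storing them. The names CopyBlocked and OverlapTest and the block width of 4 are assumptions made for this illustration. It is not meant to reproduce the exact values printed above, which depend on what the compiler and its memcpy actually did.

```cpp
#include <cassert>
#include <vector>

// Scalar reference: strict element-by-element forward copy.
void CopyScalar(int n, int* a, const int* b) {
    for (int i = 0; i < n; i++) a[i] = b[i];
}

// Emulates a vectorized copy: load up to `width` elements, then store them.
// width = 4 is an assumption (four 32-bit ints in a 128-bit register).
void CopyBlocked(int n, int* a, const int* b, int width = 4) {
    assert(width > 0 && width <= 16);
    int lane[16];
    for (int i = 0; i < n; i += width) {
        const int m = (n - i < width) ? (n - i) : width;
        for (int k = 0; k < m; k++) lane[k] = b[i + k];  // "vector load"
        for (int k = 0; k < m; k++) a[i + k] = lane[k];  // "vector store"
    }
}

// Fills A with 0..count-1, then copies with a = &A[1], b = &A[0].
std::vector<int> OverlapTest(int count, bool blocked) {
    std::vector<int> A(count);
    for (int i = 0; i < count; i++) A[i] = i;
    if (blocked) CopyBlocked(count - 1, A.data() + 1, A.data());
    else         CopyScalar(count - 1, A.data() + 1, A.data());
    return A;
}
```

With the scalar version, every element ends up equal to A[0], because each store is read back by the next iteration. With the blocked version, each load sees values that the scalar loop would already have overwritten, so the original values are only shifted within each block.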

Ravi Narayanaswamy (Intel)

ivdep.cc(4): (col. 3) remark: LOOP WAS VECTORIZED.
ivdep.cc(4): (col. 3) remark: loop was not vectorized: not inner loop.

This is an artifact of how the compiler vectorizes the loop. It creates an outer loop with only one iteration and, inside it, generates alternate code that uses either memcpy or a vector loop, depending on the total iteration count of the original loop.

So you will have

for (temp=0; temp<1; temp++) {    // dummy outer loop, runs once
  if (n < X) 
       memcpy(....)
  else
      vector form of for(int i=0; i<n; i++) .....
}

The message "remark: loop was not vectorized: not inner loop."  is for the outer dummy loop the compiler has generated and the message "remark: LOOP WAS VECTORIZED" for the actual user loop which is vectorized.
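The structure described above can be written out by hand as the following sketch. It only mirrors the shape of the generated code: kThreshold is a made-up placeholder for the compiler's internal cutoff X, the function name is hypothetical, and the inner loop is ordinary scalar code standing in for the vector form.

```cpp
#include <cassert>
#include <cstring>

// Made-up placeholder for the compiler's internal cutoff X.
const int kThreshold = 64;

// Returns 0 if the memcpy branch ran, 1 if the loop branch ran.
int CopyWithDummyOuterLoop(int n, int* a, const int* b) {
    assert(n >= 0);
    int path = -1;
    // Outer loop with a single iteration: this is the loop that would be
    // reported as "loop was not vectorized: not inner loop".
    for (int temp = 0; temp < 1; temp++) {
        if (n < kThreshold) {
            std::memcpy(a, b, static_cast<std::size_t>(n) * sizeof(int));
            path = 0;
        } else {
            // The user's loop: the one reported as "LOOP WAS VECTORIZED".
            for (int i = 0; i < n; i++) a[i] = b[i];
            path = 1;
        }
    }
    return path;
}
```

Seen this way, the two remarks refer to two different loops that happen to share one source line, which is why they appear to contradict each other.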

Andrey Vladimirov

Thanks! So when the remark "not inner loop" is reported for a loop that is in fact an innermost loop, it can be ignored.
