Diagnostic 15316: loop was not vectorized: not inner loop

Cause:

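The compiler normally targets the innermost loop of a loop nest for vectorization and the outer loop for parallelization, so an outer loop that is considered for vectorization is reported as "not inner loop". Below is an example of this scenario.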

Example:

#include<iostream>
#define N 25
int main(){
int a[N][N], b[N];
for(int j = 0; j < N; j++)
{
        for(int i = 0; i < N; i++)
                a[j][i] = 0;
        b[j] = 1;
}
int sum = __sec_reduce_add(a[:][:]) + __sec_reduce_add(b[:]);
return 0;
}


$ icpc example7.cc -vec-report2
example7.cc(7): (col. 2) remark: loop was not vectorized: loop was transformed to memset or memcpy
example7.cc(5): (col. 1) remark: loop was not vectorized: not inner loop

Resolution Status:

Add the pragma "#pragma omp simd collapse(2)" before the outer for loop and compile with the -openmp compiler option. The collapse(2) clause explicitly instructs the compiler to collapse the two loops into one for vectorization. Doing so produces the following vectorization report:
remark: OpenMP SIMD LOOP WAS VECTORIZED
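
The change described above can be sketched as follows. This is a minimal illustration, not the exact source behind the report. It makes two assumptions that are not in the original example. First, the b[j] = 1 assignment is moved into its own loop, because OpenMP 4.x requires the loops covered by collapse(2) to be perfectly nested. Second, the Cilk Plus __sec_reduce_add reductions are replaced by plain loops so the code builds with compilers that lack array notation. The function name init_and_sum is hypothetical. Whether the compiler then reports the loop as vectorized depends on the compiler version and options.

```cpp
constexpr int N = 25;

// Hypothetical helper: zero-fills a, sets every b[j] to 1,
// and returns the sum of all elements of both arrays.
int init_and_sum() {
    int a[N][N];
    int b[N];

    // collapse(2) asks the compiler to treat the j/i nest as a single
    // N*N iteration space when generating SIMD code. The two loops are
    // perfectly nested: nothing sits between the two for statements.
    #pragma omp simd collapse(2)
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            a[j][i] = 0;

    // Assumption: b is initialized separately to keep the nest above perfect.
    for (int j = 0; j < N; j++)
        b[j] = 1;

    // Plain-loop stand-in for the array-notation reductions in the example.
    int sum = 0;
    for (int j = 0; j < N; j++) {
        sum += b[j];
        for (int i = 0; i < N; i++)
            sum += a[j][i];
    }
    return sum;
}
```

Without an OpenMP option the pragma is ignored and the function still computes the same result, so the sketch can be checked for correctness independently of vectorization.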
