The first loop can be auto-vectorized using the "pcmpgtd" instruction and logical operations:
[cpp]
int i = 0;
unsigned char u8_src[2048];
int max_val = 0;
int min_val = 0;
int max_idx = 0;
int min_idx = 0;
for(i=0;i
[/cpp]
The second loop, however, cannot be auto-vectorized, even with "#pragma ivdep" or "#pragma vector always":
[cpp]
short i = 0;
unsigned char u8_src[2048];
short max_val = 0;
short min_val = 0;
short max_idx = 0;
short min_idx = 0;
for(i=0;i<2048;i++) //existence of vector dependence
{
    if(u8_src[i]
[/cpp]
With /Qvec-report3, the compiler reports:
..\Tst_if_Min\main.c(64): (col. 2) remark: loop was not vectorized: existence of vector dependence.
..\Tst_if_Min\main.c(71): (col. 3) remark: vector dependence: assumed ANTI dependence between min_val line 71 and min_val line 73.
..\Tst_if_Min\main.c(73): (col. 4) remark: vector dependence: assumed FLOW dependence between min_val line 73 and min_val line 71.
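For reference, both reported dependences are on min_val: each iteration reads it and may write it. A loop of this shape is a min reduction, which can in principle be split into independent partial minima that are merged after the loop. The following is only a scalar sketch of that idea, not what ICC generates; the function name min_u8_lanes, the LANES constant, and the init parameter are hypothetical and do not appear in the original code.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 8  /* hypothetical: eight 16-bit values fit in one 128-bit register */

/* Minimum of init and n unsigned bytes, computed as LANES partial minima. */
short min_u8_lanes(const unsigned char *src, int n, short init)
{
    short part[LANES];
    short result;
    int i, k;

    assert(src != NULL && n >= 0);

    for (k = 0; k < LANES; k++)
        part[k] = init;

    /* Each lane reads and writes only its own partial minimum,
       so no lane depends on another lane within this loop. */
    for (i = 0; i + LANES <= n; i += LANES)
        for (k = 0; k < LANES; k++)
            if (part[k] > src[i + k])
                part[k] = src[i + k];

    /* Merge the partial minima, then handle the leftover elements. */
    result = part[0];
    for (k = 1; k < LANES; k++)
        if (result > part[k])
            result = part[k];
    for (; i < n; i++)
        if (result > src[i])
            result = src[i];
    return result;
}
```

The merge step at the end is the only place where the lanes interact, which is why the loop-carried dependence on a single min_val does not have to block vectorization in principle.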
1. Why can't the second loop be auto-vectorized? Is there any way to help auto-vectorization?
2. Can I simply use "pcmpgtw" to vectorize the second loop by hand?
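Regarding question 2, a hand-vectorized version with SSE2 intrinsics might look like the sketch below. It assumes a signed 16-bit source array and computes only the minimum value, not the index; the function name min_s16_sse2 is hypothetical. The intrinsic _mm_cmpgt_epi16 corresponds to pcmpgtw, and the and/andnot/or sequence selects the smaller value per lane.

```c
#include <assert.h>
#include <stddef.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Minimum of 0 and n signed 16-bit values (min_val starts at 0 as in the post). */
short min_s16_sse2(const short *src, int n)
{
    __m128i vmin = _mm_setzero_si128();  /* eight lanes, each starting at 0 */
    short lanes[8];
    short result;
    int i, k;

    assert(src != NULL && n >= 0);

    for (i = 0; i + 8 <= n; i += 8) {
        __m128i v    = _mm_loadu_si128((const __m128i *)(src + i));
        __m128i mask = _mm_cmpgt_epi16(vmin, v);            /* pcmpgtw: vmin > v per lane */
        vmin = _mm_or_si128(_mm_and_si128(mask, v),         /* take v where mask is set */
                            _mm_andnot_si128(mask, vmin));  /* keep vmin elsewhere */
    }

    /* Horizontal reduction of the eight lanes, then the scalar tail. */
    _mm_storeu_si128((__m128i *)lanes, vmin);
    result = lanes[0];
    for (k = 1; k < 8; k++)
        if (result > lanes[k])
            result = lanes[k];
    for (; i < n; i++)
        if (result > src[i])
            result = src[i];
    return result;
}
```

SSE2 also provides _mm_min_epi16 (pminsw), which replaces the compare and the three logical operations with one instruction when only the minimum value is needed. The compare-and-mask form is still useful if the same mask should also select an index.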
Try "int i" instead of "short i". Loop indices should be pure integer, wherever possible. Otherwise the compiler might stumble (at least that's my experience).
I changed the type of "i" to "int" and even added "#pragma ivdep", but that does not help auto-vectorization for the "short" type.
However, when I change the type of "min_val" to "int" and keep the type of "i" as "short", auto-vectorization works.
So my question is: why can't ICC 11.1 use PCMPGTW to vectorize the "short" loop, the way it uses PCMPGTD to vectorize the "int" loop?
Test code: the first two loops cannot be auto-vectorized because of the "short" type of "min_val".
//test_min_s16_s16_s32
int i = 0;
short s16_src[2048];
short min_val = 0;
for(i=0;i<2048;i++) //existence of vector dependence.
{
    if(min_val > s16_src[i])
    {
        min_val = s16_src[i];
    }
}
//test_min_u8_s16_s32
int i = 0;
unsigned char u8_src[2048];
short min_val = 0;
for(i=0;i<2048;i++) //existence of vector dependence
{
    if(min_val > u8_src[i])
    {
        min_val = u8_src[i];
    }
}
//test_min_s16_s32_s32
int i = 0; //"short" type also works
short s16_src[2048];
int min_val = 0;
for(i=0;i<2048;i++) //LOOP WAS VECTORIZED.
{
    if(min_val > s16_src[i])
    {
        min_val = s16_src[i];
    }
}
//test_min_u8_s32_s32
int i = 0; //"short" type also works
unsigned char u8_src[2048];
int min_val = 0;
for(i=0;i<2048;i++) //LOOP WAS VECTORIZED.
{
    if(min_val > u8_src[i])
    {
        min_val = u8_src[i];
    }
}
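Since the two variants with an "int" min_val are the ones reported as vectorized, one possible workaround is to accumulate in "int" and narrow to "short" once after the loop. The sketch below assumes a signed 16-bit source; the function name min_s16_int_acc is hypothetical, and whether ICC 11.1 vectorizes this exact wrapper has not been checked here.

```c
#include <assert.h>
#include <stddef.h>

/* Minimum of 0 and n signed 16-bit values, accumulated in int. */
short min_s16_int_acc(const short *src, int n)
{
    int min_val = 0;  /* int accumulator, as in test_min_s16_s32_s32 */
    int i;

    assert(src != NULL && n >= 0);

    for (i = 0; i < n; i++) {
        if (min_val > src[i]) {
            min_val = src[i];
        }
    }

    /* min_val is either 0 or one of the short elements,
       so narrowing back to short cannot lose information. */
    return (short)min_val;
}
```

The caller still gets a "short" result, so the rest of the code does not need to change its types.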
I was able to reproduce the problem. Looks like it may be a parsing issue. I filed a report on this issue and will let you know when I get an update. The tracking number is DPD200162929.
Reduction function - determines the minimum value and the index of minimum value of array elements
Environment: ICC 11.1 on Windows XP.