Reduction function - determines the minimum value and the index of minimum value of array elements

With ICC 11.1 on Windows XP, the first loop can be auto-vectorized using the "pcmpgtd" instruction and logical operations:

[cpp] int i = 0; unsigned char u8_src[2048]; int max_val = 0; int min_val = 0; int max_idx = 0; int min_idx = 0; for(i=0;i

The second loop, however, cannot be auto-vectorized (the result is the same with "#pragma ivdep" or "#pragma vector always").

[cpp] short i = 0; unsigned char u8_src[2048]; short max_val = 0; short min_val = 0; short max_idx = 0; short min_idx = 0; for(i=0;i<2048;i++)//existence of vector dependence { if(u8_src[i]

With /Qvec-report3, the compiler reports:

..\Tst_if_Min\main.c(64): (col. 2) remark: loop was not vectorized: existence of vector dependence.

..\Tst_if_Min\main.c(71): (col. 3) remark: vector dependence: assumed ANTI dependence between min_val line 71 and min_val line 73.

..\Tst_if_Min\main.c(73): (col. 4) remark: vector dependence: assumed FLOW dependence between min_val line 73 and min_val line 71.

1. Why can't the second loop be auto-vectorized? Is there any way to help the auto-vectorizer?

2. Can I simply use "pcmpgtw" to vectorize the second loop by hand?
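
For reference, a minimal sketch of what hand-vectorizing the "short" minimum with "pcmpgtw" could look like using SSE2 intrinsics. The function name `min_s16_sse2` and its `init` parameter are hypothetical and not from the original code; `init` plays the role of the starting value of `min_val` (0 in the loops above). The compare produces a per-lane mask, and the and/andnot/or sequence selects the smaller value in each lane.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stddef.h>

/* Hypothetical helper: minimum of n signed 16-bit values, starting from init. */
short min_s16_sse2(const short *src, size_t n, short init)
{
    __m128i vmin = _mm_set1_epi16(init);  /* eight running minima, one per lane */
    short lanes[8];
    short result;
    size_t i = 0;
    int k;

    /* Main loop: eight 16-bit elements per iteration. */
    for (; i + 8 <= n; i += 8) {
        __m128i v    = _mm_loadu_si128((const __m128i *)(src + i));
        __m128i mask = _mm_cmpgt_epi16(vmin, v);            /* pcmpgtw: all ones where vmin > v */
        vmin = _mm_or_si128(_mm_and_si128(mask, v),         /* take v where it is smaller */
                            _mm_andnot_si128(mask, vmin));  /* keep vmin elsewhere */
    }

    /* Horizontal step: reduce the eight lanes to one scalar. */
    _mm_storeu_si128((__m128i *)lanes, vmin);
    result = lanes[0];
    for (k = 1; k < 8; k++)
        if (result > lanes[k])
            result = lanes[k];

    /* Scalar tail for the remaining n % 8 elements. */
    for (; i < n; i++)
        if (result > src[i])
            result = src[i];

    return result;
}
```

SSE2 also provides `_mm_min_epi16` (the "pminsw" instruction), which could replace the compare-and-select sequence with a single operation. Tracking the index of the minimum as well would need an additional vector of indices selected with the same mask, which this sketch does not attempt.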

Try "int i" instead of "short i". Loop indices should be pure integer, wherever possible. Otherwise the compiler might stumble (at least that's my experience).

Best Olaf

I changed the type of "i" to "int" and even added "#pragma ivdep", but it did not help auto-vectorization of the "short" case.

However, when I change the type of "min_val" to "int" and keep the type of "i" as "short", auto-vectorization works.

So my question is: why can't ICC 11.1 use PCMPGTW to vectorize the "short" loop, the way it uses PCMPGTD to vectorize the "int" loop?

Test code: the first two loops cannot be auto-vectorized due to the "short" type of "min_val".

```cpp
//test_min_s16_s16_s32
int i = 0;
short s16_src[2048];
short min_val = 0;

for (i = 0; i < 2048; i++) //existence of vector dependence.
{
    if (min_val > s16_src[i])
    {
        min_val = s16_src[i];
    }
}

//test_min_u8_s16_s32
int i = 0;
unsigned char u8_src[2048];
short min_val = 0;

for (i = 0; i < 2048; i++) //existence of vector dependence
{
    if (min_val > u8_src[i])
    {
        min_val = u8_src[i];
    }
}

//test_min_s16_s32_s32
int i = 0; //"short" type also works
short s16_src[2048];
int min_val = 0;

for (i = 0; i < 2048; i++) //LOOP WAS VECTORIZED.
{
    if (min_val > s16_src[i])
    {
        min_val = s16_src[i];
    }
}

//test_min_u8_s32_s32
int i = 0; //"short" type also works
unsigned char u8_src[2048];
int min_val = 0;

for (i = 0; i < 2048; i++) //LOOP WAS VECTORIZED.
{
    if (min_val > u8_src[i])
    {
        min_val = u8_src[i];
    }
}
```
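
Based on the observation that the "int" accumulator vectorizes, one possible workaround is sketched here: keep the running minimum in an "int" inside the loop and narrow it to "short" once afterwards. The function name `min_s16_via_int` is hypothetical, and whether a particular compiler build vectorizes this loop is an assumption drawn only from the reports in this thread.

```c
/* Hypothetical helper: int accumulator in the loop, narrowed to short at the end. */
short min_s16_via_int(const short *src, int n)
{
    int min_val = 0;  /* int accumulator, as in test_min_s16_s32_s32 */
    int i;

    for (i = 0; i < n; i++) {
        if (min_val > src[i])
            min_val = src[i];
    }

    /* The result is either 0 or an element of src, so it always fits in a short. */
    return (short)min_val;
}
```

The result matches the "short" version because the final value is always either the initial 0 or one of the array elements.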

What is the build date of the icl 11.1 compiler you are using?

Thanks,
--mark

Hi, Mark

I am using Intel C++ Compiler 11.1.038 on Windows XP (SP3); the CPU is a Core i5 M450.
Can you reproduce my problem?

Thanks,
lychee

Hi Lychee,

I was able to reproduce the problem. Looks like it may be a parsing issue. I filed a report on this issue and will let you know when I get an update. The tracking number is DPD200162929.

--mark

This issue has been resolved in Intel(R) C++ Composer XE 13.0 update 1 available at registrationcenter.intel.com.

Thanks,
--mark