# Problem with _mm256_and_ps instruction

Hi,

I was trying to do a very simple exercise using vector instructions, but I am getting wrong results.

In the following program I am trying to do a bitwise AND operation using the _mm256_and_ps instruction.
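/////////////////////////////////////////////////////
#include <immintrin.h>
#include <iostream>
using namespace std;
int main(){
    __m256 x1,x2;
    float* x=(float*) _mm_malloc(8*sizeof(float),64);
    for(int i=0;i<8;i++)
        x[i] = (float)i;
    // Assumed setup (these lines were missing from the post): x1 holds 0.0 .. 7.0
    // and x2 holds 3.0 in every lane, matching the "& 3" examples below.
    x1 = _mm256_load_ps(x);
    x2 = _mm256_set1_ps(3.0f);
    x2 = _mm256_and_ps (x1,x2);   // ANDs the raw bits of each float lane
    _mm256_store_ps(x,x2);
    for(int i=0;i<8;i++)
        cout<<x[i]<<endl;
    _mm_free(x);
    return 0;
}
///////////////////////////////////////////////////////////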
If you run the program, you will see that some of the results of the AND operation are wrong.
For example, 1 & 3 = 1 but the result from the program is 0; similarly, 6 & 3 = 2 but it's giving 3.

Could anyone explain why that happened? Is this happening because I am using floating point data?

Thanks and best regards,
Jesmin


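Yes, that is because you are using floating-point values. _mm256_and_ps ANDs the raw bits of each 32-bit lane, and floating-point numbers are stored in the IEEE-754 format (sign, exponent, mantissa), not as plain binary integers. This is how the numbers from 0.0 through 7.0 look in your calculation: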

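```
#include <immintrin.h>
#include <cstring>
#include <iostream>

using namespace std;

// Print the 32 raw bits of a float, most significant bit first.
void printbits(float* v) {
    unsigned int n;
    memcpy(&n, v, sizeof n);   // copy the bits without violating aliasing rules
    for (int i = 0; i < 32; i++) {
        unsigned int bitmask = 1u << (31 - i);
        cout << ( (n & bitmask) ? '1' : '0' );
    }
}

int main(){
    __m256 x1,x2;
    float* x=(float*) _mm_malloc(8*sizeof(float),64);
    float* xorig=(float*) _mm_malloc(8*sizeof(float),64);
    float three = 3.0f;   // second operand, as implied by the output below
    for(int i=0;i<8;i++)  {
        x[i] = (float)i;
        xorig[i] = x[i];
    }
    x1 = _mm256_load_ps(x);        // 0.0 .. 7.0
    x2 = _mm256_set1_ps(three);    // 3.0 in every lane
    x2 = _mm256_and_ps (x1,x2);
    _mm256_store_ps(x,x2);

    for(int i=0;i<8;i++) {
        cout << "i=" << i << ": ";
        printbits(&xorig[i]);
        cout << " & ";
        printbits(&three);
        cout << " = ";
        printbits(&x[i]);
        cout << endl;
    }
    _mm_free(x);
    _mm_free(xorig);
    return 0;
}
```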

```
[cfxuser@c001-n001 ~]$ ./a.out
i=0: 00000000000000000000000000000000 & 01000000010000000000000000000000 = 00000000000000000000000000000000
i=1: 00111111100000000000000000000000 & 01000000010000000000000000000000 = 00000000000000000000000000000000
i=2: 01000000000000000000000000000000 & 01000000010000000000000000000000 = 01000000000000000000000000000000
i=3: 01000000010000000000000000000000 & 01000000010000000000000000000000 = 01000000010000000000000000000000
i=4: 01000000100000000000000000000000 & 01000000010000000000000000000000 = 01000000000000000000000000000000
i=5: 01000000101000000000000000000000 & 01000000010000000000000000000000 = 01000000000000000000000000000000
i=6: 01000000110000000000000000000000 & 01000000010000000000000000000000 = 01000000010000000000000000000000
i=7: 01000000111000000000000000000000 & 01000000010000000000000000000000 = 01000000010000000000000000000000
[cfxuser@c001-n001 ~]$
```

Jesmin,

A follow-on question is: do you need to perform something with floating-point values that is equivalent to a logical AND on integers (without converting the floats to ints)? If so, you can (as a hack) add the appropriate power of 2 to each value so that its integer bits land in the mantissa at a position suitable for your AND mask, apply the binary AND, then subtract the same power of 2. Note that the portion of the mantissa you can manipulate in a float is 23 bits.

Jim Dempsey