bitwise epi32 vs epi64

For bitwise operations there should not be any difference in the result no matter which vector element type is used as input.

Then why are there separate intrinsics for epi32 and epi64? And does choosing one over the other make any difference?


In looking back over some old forum issues that didn't get addressed, I came across this one and was curious. So, I looked in the Intel® Xeon Phi™ Coprocessor Instruction Set Architecture Reference Manual and found that the format of the underlying instruction is like the following one for xor:

vpxord zmm1 {k1}, zmm2, Si32(zmm3/mt) 

The underlying instruction allows masking of destination elements and a swizzle of the zmm3 source vector, both of which operate at element granularity, which is why you need separate instructions for 32-bit and 64-bit elements. On the intrinsics side there are variants for when the mask and swizzle are used and for when they are not, but both still need to map back to the same underlying instruction. There really isn't any benefit in having an intrinsic that doesn't specify element size.
