I still do not understand what the _MM_SHUFFLE macro means in the context of AVX _mm256_shuffle_ps() (for SSE, it's OK).
Any help with this?
The meaning is exactly the same with AVX and SSE.
With 256-bit wide AVX shuffles, the high and low 128-bit "lanes" are processed independently (i.e., the same _MM_SHUFFLE selector is applied to each lane, and an element cannot move from the low lane to the high lane or vice versa), as with most VEX.256 instructions.
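To make the lane behaviour concrete, here is a minimal scalar sketch in plain C of what I understand _mm256_shuffle_ps(a, b, imm8) to compute. The names MODEL_SHUFFLE and model_mm256_shuffle_ps are made up for this illustration (they are not part of any header); MODEL_SHUFFLE just repeats the bit layout of _MM_SHUFFLE so the snippet builds without the intrinsics headers or -mavx.

```c
#include <assert.h>

/* Same bit layout as _MM_SHUFFLE(z, y, x, w); renamed (hypothetical name)
   so this sketch does not depend on <xmmintrin.h>. */
#define MODEL_SHUFFLE(z, y, x, w) (((z) << 6) | ((y) << 4) | ((x) << 2) | (w))

/* Hypothetical scalar model of _mm256_shuffle_ps(a, b, imm8).
   Element 0 is the lowest float of the register.
   dst must not alias a or b. */
static void model_mm256_shuffle_ps(const float a[8], const float b[8],
                                   int imm8, float dst[8])
{
    assert(imm8 >= 0 && imm8 <= 255);

    /* The same 8-bit selector is applied to each 128-bit lane separately. */
    for (int lane = 0; lane < 2; ++lane) {
        int base = 4 * lane;                           /* first element of this lane */
        dst[base + 0] = a[base + ((imm8 >> 0) & 3)];   /* w: picks from a's lane */
        dst[base + 1] = a[base + ((imm8 >> 2) & 3)];   /* x: picks from a's lane */
        dst[base + 2] = b[base + ((imm8 >> 4) & 3)];   /* y: picks from b's lane */
        dst[base + 3] = b[base + ((imm8 >> 6) & 3)];   /* z: picks from b's lane */
    }
}
```

Every index is offset by base, so the low four results only ever read elements 0..3 of a and b, and the high four only read elements 4..7. That is the whole difference from the SSE case: the 128-bit shuffle is simply done twice with the same selector.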
Depending on your use case, AVX2 _mm256_permutevar8x32_ps() may be a better fit, since it can shuffle across the whole 256 bits; note that it takes a single source vector plus a vector of indices rather than an _MM_SHUFFLE immediate.
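For comparison, a scalar sketch of the cross-lane permute as I understand it. Again, model_mm256_permutevar8x32_ps is a made-up name for illustration only, and plain arrays stand in for the __m256 and __m256i arguments.

```c
/* Hypothetical scalar model of _mm256_permutevar8x32_ps(a, idx).
   Each result element can come from any of the 8 source elements,
   so data can cross the 128-bit lane boundary.
   dst must not alias a. */
static void model_mm256_permutevar8x32_ps(const float a[8], const int idx[8],
                                          float dst[8])
{
    for (int i = 0; i < 8; ++i)
        dst[i] = a[idx[i] & 7];   /* only the low 3 bits of each index are used */
}
```

With indices {4, 5, 6, 7, 0, 1, 2, 3} this swaps the two lanes, which the per-lane shuffle cannot do.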
Thanks a lot, Bronxzv.
Now I understand this issue clearly.