About shufps instruction

Hi all,

I have a question about the shufps instruction. What kind of C code would typically cause the compiler to generate shufps?

Thank you for your help!

We can't guess your target without more hints. In general, code that gathers or scatters elements to and from a packed vector can produce shufps when compiled with an SSE2 code option, possibly with the help of #pragma vector always. Setting SSE4 options would promote newer instructions for the same purpose.
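To make the "gathers or scatters elements" case concrete, here is a minimal sketch of a loop that reads every other element of an array. The function name `deinterleave` and its parameters are made up for illustration. When a compiler auto-vectorizes a loop like this for SSE2, it has to rearrange elements inside packed registers, and shufps is one instruction it may choose for that. Whether it actually appears depends on the compiler, its version, and the options used, so check the generated assembly.

```c
#include <stddef.h>

/* Hypothetical example: split interleaved (re, im) pairs into two arrays.
 * The stride-2 loads force a vectorizing compiler to shuffle elements
 * within packed registers; with SSE2 code generation this may be done
 * with shufps (not guaranteed, inspect the assembly output).
 * With the Intel compiler, "#pragma vector always" could be placed
 * before the loop to encourage vectorization. */
void deinterleave(const float *restrict in,
                  float *restrict re,
                  float *restrict im,
                  size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        re[i] = in[2 * i];      /* even elements */
        im[i] = in[2 * i + 1];  /* odd elements  */
    }
}
```

Calling it with `in = {1, 2, 3, 4, 5, 6, 7, 8}` and `n = 4` fills `re` with 1, 3, 5, 7 and `im` with 2, 4, 6, 8.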

>>...So what kind of C code would usually generate shufps by the compiler?

I agree with Tim that your question is really hard to answer. So, I've looked at Intel headers with intrinsic functions and here are some details:

immintrin.h
...
/*
* Shuffle Packed Single Precision Floating-Point Values
* **** VSHUFPS ymm1, ymm2, ymm3/m256, imm8
* Moves two of the four packed single-precision floating-point values
* from each double qword of the first source operand into the low
* quadword of each double qword of the destination; moves two of the four
* packed single-precision floating-point values from each double qword of
* the second source operand into the high quadword of each double qword
* of the destination. The selector operand determines which values are moved
* to the destination.
*/
extern __m256 __ICL_INTRINCC _mm256_shuffle_ps(__m256, __m256, const int);
...
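The header comment above is dense, so here is a scalar sketch of what the selector does within one 128-bit group of four floats. The function `shufps_model` is a hypothetical helper written only for illustration, not part of any Intel header. It follows the documented imm8 layout: bits 1:0 and 3:2 pick two elements from the first source, bits 5:4 and 7:6 pick two from the second source.

```c
/* Hypothetical scalar model of SHUFPS for one 128-bit group (4 floats).
 * a    = first source operand, b = second source operand
 * imm8 = selector; each 2-bit field is an index 0..3 */
void shufps_model(const float a[4], const float b[4],
                  unsigned imm8, float dst[4])
{
    float t[4];  /* temporary, so dst may alias a or b */

    t[0] = a[imm8 & 3];         /* low half comes from the first source  */
    t[1] = a[(imm8 >> 2) & 3];
    t[2] = b[(imm8 >> 4) & 3];  /* high half comes from the second source */
    t[3] = b[(imm8 >> 6) & 3];

    for (int i = 0; i < 4; ++i)
        dst[i] = t[i];
}
```

For the 256-bit VSHUFPS form, the same selection is applied independently to the lower and upper 128-bit halves using the same imm8.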

A very generic answer would be: a C/C++ compiler will generate the instruction if the C/C++ code uses the _mm256_shuffle_ps intrinsic function or contains inline assembler code for the instruction (assuming that generation of AVX instructions is enabled).
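As a small sketch of the intrinsic route: the 128-bit SSE counterpart of _mm256_shuffle_ps is _mm_shuffle_ps from xmmintrin.h, which corresponds to shufps (the 256-bit intrinsic corresponds to VSHUFPS). The helper name `reverse4` is made up for this example, and a compiler is still free to pick a different instruction with the same effect.

```c
#include <xmmintrin.h>  /* SSE: _mm_loadu_ps, _mm_shuffle_ps, _MM_SHUFFLE */

/* Hypothetical helper: write the four floats of a[] to out[] in reverse order. */
void reverse4(const float *a, float *out)
{
    __m128 v = _mm_loadu_ps(a);  /* unaligned load of a[0..3] */

    /* _MM_SHUFFLE(z, y, x, w) builds imm8 = (z<<6)|(y<<4)|(x<<2)|w.
     * Result: r[0] = v[w], r[1] = v[x], r[2] = v[y], r[3] = v[z].
     * Both sources are v here, so (0, 1, 2, 3) reverses the vector. */
    __m128 r = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));

    _mm_storeu_ps(out, r);       /* unaligned store to out[0..3] */
}
```

With `a = {1, 2, 3, 4}` this stores 4, 3, 2, 1 into `out`.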

Also, see the Intel Instruction Set Reference Manual (Volumes 2A, 2B and 2C) for a more detailed description of the instruction.

Sorry for the confusion. I meant to ask what kind of C code could be possibly translated into shufps by the compiler. I think the problem has been solved. Thank you guys! :-)
