Problems with the unexpected behavior of vshufpd

Hello there.

I wrote this little piece of code:
#include <immintrin.h>
#include <stdio.h>

typedef struct quaternion {
    double a;
    double i;
    double j;
    double k;
} quaternion;

int main(void) {
    struct quaternion n1 = (struct quaternion){3.141592, -2.72192, -6.28384, -9.478};

    __m256d n1_t = *(__m256d *)&n1;

                            // 0b00011011
    n1_t = _mm256_shuffle_pd(n1_t, n1_t, _MM_SHUFFLE(0,1,2,3));
    *(__m256d *)&n1 = n1_t;

    printf("\t%lf %lf %lf %lf\n", n1.a, n1.i, n1.j, n1.k);

    return 0;
}

P.S. I could not find an explanation of what the mask should be, so I built it the SSE way.

I expected to see this result: 3.141592 -2.721920 -6.283840 -9.478000

But I got this: -2.721920 -2.721920 -6.283840 -9.478000

I can think of only two possible explanations:
1) The mask is defined in another way
2) There is an error in the Intel® Software Development Emulator

The _mm256_shuffle_pd intrinsic is translated to the VSHUFPD instruction.
The _MM_SHUFFLE macro is intended for the SHUFPS instruction, not SHUFPD.
To use _mm256_shuffle_pd correctly you need to build the shuffle selector by hand.
The selector byte is described in the SDM Vol. 2, in the description of the SHUFPD instruction.

In your example the selector should be 0xa (1010b), which gives the output you expected.
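
To see why 0x1B and 0xa behave so differently, here is a small sketch in plain C that models the 256-bit VSHUFPD selection rule with scalar code. It is only an illustration under my reading of the SDM: the function names vshufpd256_model and print_shuffled are made up for this post and are not part of any Intel header. Only the low four bits of the selector matter, and each bit picks between two neighbouring elements inside one 128-bit half, so nothing ever crosses from the low half to the high half.

```c
#include <stdio.h>

/* Scalar model of the 256-bit VSHUFPD selection rule.
 * Hypothetical helper for illustration, not an Intel API.
 * Only the low four bits of imm are used:
 *   bit 0: dst[0] = a[0] (0) or a[1] (1)
 *   bit 1: dst[1] = b[0] (0) or b[1] (1)
 *   bit 2: dst[2] = a[2] (0) or a[3] (1)
 *   bit 3: dst[3] = b[2] (0) or b[3] (1)
 */
void vshufpd256_model(const double a[4], const double b[4],
                      unsigned imm, double dst[4])
{
    double r[4]; /* temporary, so dst may alias a or b */

    r[0] = (imm & 1u) ? a[1] : a[0];
    r[1] = (imm & 2u) ? b[1] : b[0];
    r[2] = (imm & 4u) ? a[3] : a[2];
    r[3] = (imm & 8u) ? b[3] : b[2];

    for (int i = 0; i < 4; ++i)
        dst[i] = r[i];
}

/* Applies the model with both sources equal to q, as in the question,
 * and prints the four resulting elements. */
void print_shuffled(const double q[4], unsigned imm)
{
    double out[4];

    vshufpd256_model(q, q, imm, out);
    printf("imm=0x%02X: %f %f %f %f\n", imm, out[0], out[1], out[2], out[3]);
}
```

Calling print_shuffled with the quaternion from the question and 0x1B (the value _MM_SHUFFLE(0,1,2,3) expands to) selects elements 1, 1, 2, 3, which is exactly the output reported above. With 0xa the model selects elements 0, 1, 2, 3, so the values stay in place. In the original program that corresponds to writing _mm256_shuffle_pd(n1_t, n1_t, 0xA).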

