Developer Guide and Reference



Permutes quadword integer values of the source vector into the destination vector. The corresponding Intel® AVX2 instruction is VPERMQ.


extern __m256i _mm256_permute4x64_epi64(__m256i val, const int control);
val: the vector of 64-bit quadword integer elements to be permuted
control: an integer specified as an 8-bit immediate
Uses two-bit index values in the immediate byte to select a qword integer element from the source vector. Each selected element is copied to the corresponding element of the destination vector. The same source element can be copied to more than one element of the destination vector.
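The selection described above can be illustrated with a short C sketch. It is not taken from the Intel documentation: PERM4X64_CTRL, reverse_qwords, and broadcast_qword0 are hypothetical names introduced here, and the scalar fallback exists only so the example also builds when AVX2 is not enabled at compile time. The control argument must be a compile-time constant, which is why it is built with a macro rather than at run time.

```c
#include <stdint.h>
#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper macro (not an Intel API): packs four 2-bit selectors
   into the control byte. e0 picks the source element for destination
   element 0 (bits 1:0); e3 picks it for destination element 3 (bits 7:6). */
#define PERM4X64_CTRL(e3, e2, e1, e0) \
    ((((e3) & 3) << 6) | (((e2) & 3) << 4) | (((e1) & 3) << 2) | ((e0) & 3))

/* Hypothetical example: reverse the order of four 64-bit integers. */
static void reverse_qwords(const int64_t in[4], int64_t out[4])
{
#if defined(__AVX2__)
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    /* Destination 0 <- source 3, 1 <- 2, 2 <- 1, 3 <- 0 */
    __m256i r = _mm256_permute4x64_epi64(v, PERM4X64_CTRL(0, 1, 2, 3));
    _mm256_storeu_si256((__m256i *)out, r);
#else
    /* Scalar fallback (assumption: used only when AVX2 is unavailable). */
    int64_t tmp[4] = { in[3], in[2], in[1], in[0] };
    for (int i = 0; i < 4; ++i) out[i] = tmp[i];
#endif
}

/* Hypothetical example: copy source element 0 to all four destination
   elements, showing that one source element may be selected repeatedly. */
static void broadcast_qword0(const int64_t in[4], int64_t out[4])
{
#if defined(__AVX2__)
    __m256i v = _mm256_loadu_si256((const __m256i *)in);
    __m256i r = _mm256_permute4x64_epi64(v, PERM4X64_CTRL(0, 0, 0, 0));
    _mm256_storeu_si256((__m256i *)out, r);
#else
    int64_t x = in[0];
    for (int i = 0; i < 4; ++i) out[i] = x;
#endif
}
```

With GCC or Clang, the intrinsic path is typically enabled by compiling with -mavx2; without that flag the sketch falls back to the scalar code.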
Below is the pseudo-code for the intrinsic:
RESULT[63:0] <- (VAL[255:0] >> (CONTROL[1:0] * 64))[63:0];
RESULT[127:64] <- (VAL[255:0] >> (CONTROL[3:2] * 64))[63:0];
RESULT[191:128] <- (VAL[255:0] >> (CONTROL[5:4] * 64))[63:0];
RESULT[255:192] <- (VAL[255:0] >> (CONTROL[7:6] * 64))[63:0];
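The pseudo-code can also be read as a plain loop over the four destination elements. The following is a minimal scalar model in portable C, offered as a sketch for checking control values by hand; permute4x64_ref is a hypothetical name, not part of the intrinsics API, and the array layout (index 0 holds bits 63:0) is an assumption of this model.

```c
#include <stdint.h>

/* Hypothetical scalar model of the pseudo-code (not an Intel API).
   val[0] stands for VAL[63:0], val[3] for VAL[255:192].
   Destination element i receives the source element chosen by
   control bits [2*i+1 : 2*i]. val and result must not overlap. */
static void permute4x64_ref(const uint64_t val[4], unsigned control,
                            uint64_t result[4])
{
    for (int i = 0; i < 4; ++i) {
        unsigned sel = (control >> (2 * i)) & 0x3u; /* 2-bit selector */
        result[i] = val[sel];
    }
}
```

Shifting the 256-bit source right by sel * 64 bits and keeping the low 64 bits, as the pseudo-code does, is the same as indexing element sel, which is what the loop body expresses.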
Returns the result of the permute operation.
