_mm512_i32[ext]gather_epi32 / _mm512_mask_i32[ext]gather_epi32

Gathers an int32 vector using int32 indices. The corresponding instruction is VPGATHERDD. These intrinsics apply only to Intel® Many Integrated Core Architecture (Intel® MIC Architecture).

Syntax

Without Mask

extern __m512i __cdecl _mm512_i32extgather_epi32(__m512i index, void const* mv, _MM_UPCONV_EPI32_ENUM conv, int scale, int hint);

extern __m512i __cdecl _mm512_i32gather_epi32(__m512i index, void const* mv, _MM_UPCONV_EPI32_NONE, int scale, _MM_HINT_NONE);

With Mask

extern __m512i __cdecl _mm512_mask_i32extgather_epi32(__m512i v1_old, __mmask16 k1, __m512i index, void const* mv, _MM_UPCONV_EPI32_ENUM conv, int scale, int hint);

extern __m512i __cdecl _mm512_mask_i32gather_epi32(__m512i v1_old, __mmask16 k1, __m512i index, void const* mv, _MM_UPCONV_EPI32_NONE, int scale, _MM_HINT_NONE);

Parameters

v1_old

Source vector that holds the old values of the destination vector; result elements whose mask bit is zero are copied from the corresponding elements of v1_old.

k1

Writemask; only elements whose corresponding bit in k1 is set to '1' are gathered and stored in the result; result elements whose bit in k1 is zero are copied from the corresponding elements of v1_old.

index

int32 vector of indices; each index, multiplied by scale, gives the offset of one element from the base address mv

mv

Pointer to base address in memory

conv

Type of upconversion, which can be one of the following:

  • _MM_UPCONV_EPI32_NONE - no conversion
  • _MM_UPCONV_EPI32_UINT8 - uint8 => uint32
  • _MM_UPCONV_EPI32_SINT8 - sint8 => sint32
  • _MM_UPCONV_EPI32_UINT16 - uint16 => uint32
  • _MM_UPCONV_EPI32_SINT16 - sint16 => sint32
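The conversions listed above can be illustrated with a scalar model. This is only a sketch of the widening rules (zero-extension for the unsigned types, sign-extension for the signed types); the enum `upconv_epi32_model` and the function `upconvert_epi32` are hypothetical names invented for this illustration and are not part of any Intel header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for _MM_UPCONV_EPI32_ENUM. The real enum and its
   numeric values come from the compiler's header and are not reproduced here. */
typedef enum {
    UPCONV_NONE,
    UPCONV_UINT8,
    UPCONV_SINT8,
    UPCONV_UINT16,
    UPCONV_SINT16
} upconv_epi32_model;

/* Read one element stored at src and widen it to 32 bits according to conv. */
static int32_t upconvert_epi32(const void *src, upconv_epi32_model conv)
{
    switch (conv) {
    case UPCONV_NONE:   { int32_t  v; memcpy(&v, src, sizeof v); return v; }
    case UPCONV_UINT8:  { uint8_t  v; memcpy(&v, src, sizeof v); return (int32_t)v; } /* zero-extend */
    case UPCONV_SINT8:  { int8_t   v; memcpy(&v, src, sizeof v); return (int32_t)v; } /* sign-extend */
    case UPCONV_UINT16: { uint16_t v; memcpy(&v, src, sizeof v); return (int32_t)v; } /* zero-extend */
    case UPCONV_SINT16: { int16_t  v; memcpy(&v, src, sizeof v); return (int32_t)v; } /* sign-extend */
    }
    assert(0 && "unknown conversion");
    return 0;
}
```

For example, the byte 0xFF widens to 255 under the uint8 conversion and to -1 under the sint8 conversion.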

scale

Scaling factor used to calculate the address of each element. Takes one of the following values: 1, 2, 4, or 8. The byte address of the i-th element in memory is calculated as: mv + index[i] * scale
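The address formula can be checked with a small scalar helper. This is a sketch only: `gather_element_address` is a hypothetical name, not an Intel function, and it assumes the index is treated as a signed 32-bit value and that the arithmetic is done in bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of any Intel header): returns the byte
   address that element i of a gather would read, given index[i]. */
static const void *gather_element_address(const void *mv, int32_t index_i, int scale)
{
    /* The parameter description lists 1, 2, 4, and 8 as the valid scales. */
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    /* mv + index[i] * scale, computed in bytes. */
    return (const unsigned char *)mv + (ptrdiff_t)index_i * scale;
}
```

With an int32 table and scale 4, index 3 addresses the fourth table entry; with scale 1, the same entry is reached with index 12, because the index is then a plain byte offset.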

hint

Hint that indicates to the processor that the data is non-temporal. Takes the value 0 or 1, where:

  • _MM_HINT_NONE = 0
  • _MM_HINT_NT = 1 (Load is non-temporal)

Description

Up-converts a set of 16 memory locations pointed to by the base address mv and the int32 index vector index, scaled by scale, and gathers them into an int32 vector.

For the masked variant, the elements of the resulting vector whose corresponding bit in the writemask k1 is set are populated with the gathered values. The remaining elements of the resulting vector are populated with the corresponding elements from v1_old.

The non-masked variant of the intrinsic is equivalent to the masked variant with full mask (k1=0xffff).
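The masking behavior described above can be modeled in scalar C. The sketch below is not the intrinsic itself and does not run on the vector unit; it assumes no up-conversion (the `_MM_UPCONV_EPI32_NONE` case) and ignores the non-temporal hint. The name `masked_gather_epi32_model` and the plain-array representation of the vectors are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LANES 16 /* a 512-bit vector holds 16 int32 elements */

/* Scalar model of a masked int32 gather without up-conversion.
   Hypothetical function, written only to illustrate the semantics. */
static void masked_gather_epi32_model(int32_t dst[LANES],
                                      const int32_t v1_old[LANES],
                                      uint16_t k1,
                                      const int32_t index[LANES],
                                      const void *mv,
                                      int scale)
{
    assert(scale == 1 || scale == 2 || scale == 4 || scale == 8);

    for (int i = 0; i < LANES; ++i) {
        if ((k1 >> i) & 1u) {
            /* Mask bit set: load from mv + index[i] * scale (byte address). */
            const unsigned char *p =
                (const unsigned char *)mv + (ptrdiff_t)index[i] * scale;
            memcpy(&dst[i], p, sizeof dst[i]);
        } else {
            /* Mask bit clear: keep the old destination value. */
            dst[i] = v1_old[i];
        }
    }
}
```

Calling the model with k1 = 0xFFFF loads all 16 elements, which mirrors the statement that the non-masked variant behaves like the masked variant with a full mask.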

Note

These intrinsics do not have broadcast support.

Returns

Returns the int32 vector produced by the up-convert gather operation.
