Developer Guide and Reference

Intrinsics for Integer Load and Store Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
| Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction |
|---|---|---|
| _mm512_load_epi32, _mm512_mask_load_epi32, _mm512_maskz_load_epi32 | Load packed int32 elements from memory | VMOVDQA32 |
| _mm512_load_epi64, _mm512_mask_load_epi64, _mm512_maskz_load_epi64 | Load packed int64 elements from memory | VMOVDQA64 |
| _mm512_loadu_si512 | Unaligned load of 512 bits of integer data | VMOVDQU32 |
| _mm512_mask_loadu_epi32, _mm512_maskz_loadu_epi32 | Unaligned load of packed int32 elements | VMOVDQU32 |
| _mm512_mask_loadu_epi64, _mm512_maskz_loadu_epi64 | Unaligned load of packed int64 elements | VMOVDQU64 |
| _mm512_stream_load_si512 | Load 512 bits of integer data using non-temporal aligned hint | VMOVNTDQA |
| _mm512_mask_storeu_epi64 | Store unaligned packed int64 elements | VMOVDQU64 |
| _mm512_stream_si512 | Store packed integer values using non-temporal hint | VMOVNTDQ |
| variable | definition |
|---|---|
| k | writemask used as a selector |
| a | first source vector element |
| mem_addr | pointer to base address in memory |
| src | source element to use based on writemask result |
_mm512_load_si512
extern __m512i __cdecl _mm512_load_si512(void const* mem_addr);
Load 512 bits of integer data from memory into destination. mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
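The alignment requirement can be checked before the load is issued. The following is a minimal sketch, not part of the reference itself: copy64_aligned, copy64_avx512, and is_aligned_64 are hypothetical names, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed here so that the block builds without extra compiler flags and falls back to memcpy on processors without AVX-512.

```c
#include <immintrin.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero when p sits on a 64-byte boundary. */
static int is_aligned_64(const void *p) {
    return ((uintptr_t)p & 63u) == 0;
}

/* AVX-512 path. The target attribute (GCC/Clang extension, assumed
   toolchain) enables avx512f for this one function. */
__attribute__((target("avx512f")))
static void copy64_avx512(void *dst, const void *src) {
    __m512i v = _mm512_load_si512(src);   /* aligned 64-byte load  */
    _mm512_store_si512(dst, v);           /* aligned 64-byte store */
}

/* Copies one 64-byte block. Returns 0 on success, or -1 when either
   pointer is misaligned, rather than risking a general-protection
   exception in the aligned load or store. */
int copy64_aligned(void *dst, const void *src) {
    if (!is_aligned_64(dst) || !is_aligned_64(src))
        return -1;
    if (__builtin_cpu_supports("avx512f"))
        copy64_avx512(dst, src);
    else
        memcpy(dst, src, 64);             /* portable fallback */
    return 0;
}
```

A caller would typically obtain suitable buffers with _Alignas(64) or an aligned allocator; a pointer offset by a few bytes from such a buffer is rejected by the check.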
_mm512_loadu_si512
extern __m512i __cdecl _mm512_loadu_si512(void const* mem_addr);
Load 512 bits of integer data from memory into destination. mem_addr does not need to be aligned on any particular boundary.
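As an illustration of the relaxed alignment, the sketch below reads sixteen int32 values from an arbitrary address, adds a constant to each, and writes the result back with an unaligned store. The names add_const16 and add_const16_avx512 are hypothetical, and the runtime dispatch relies on GCC/Clang extensions (an assumption about the toolchain).

```c
#include <immintrin.h>
#include <stdint.h>

/* AVX-512 path; the target attribute is a GCC/Clang extension (assumed)
   that enables avx512f for this function only. */
__attribute__((target("avx512f")))
static void add_const16_avx512(int32_t *out, const int32_t *in, int32_t c) {
    __m512i v = _mm512_loadu_si512(in);             /* no alignment required */
    v = _mm512_add_epi32(v, _mm512_set1_epi32(c));  /* add c to every lane   */
    _mm512_storeu_si512(out, v);                    /* unaligned store       */
}

/* Hypothetical helper: out[i] = in[i] + c for 16 int32 values located
   at any address. */
void add_const16(int32_t *out, const int32_t *in, int32_t c) {
    if (__builtin_cpu_supports("avx512f")) {
        add_const16_avx512(out, in, c);
    } else {
        for (int i = 0; i < 16; i++)                /* scalar fallback */
            out[i] = in[i] + c;
    }
}
```

Passing a pointer such as buf + 1, which is only 4-byte aligned, is valid here, whereas the same pointer would fault with _mm512_load_si512.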
_mm512_load_epi32
extern __m512i __cdecl _mm512_load_epi32(void const* mem_addr);
Load 512 bits (composed of sixteen packed 32-bit integers) from memory into destination. mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
_mm512_mask_load_epi32
extern __m512i __cdecl _mm512_mask_load_epi32(__m512i src, __mmask16 k, void const* mem_addr);
Load packed int32 elements from memory into destination using writemask k (elements are copied from src when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
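The merge behavior of the writemask can be sketched as follows: bit i of k selects element i from memory, and every other element keeps the value supplied in src. The names merge_load16 and merge_load16_avx512 and the fill parameter are hypothetical; the dispatch uses GCC/Clang extensions (assumed), with a scalar loop that mirrors the same semantics.

```c
#include <immintrin.h>
#include <stdint.h>

/* AVX-512 path (target attribute: GCC/Clang extension, assumed). */
__attribute__((target("avx512f")))
static void merge_load16_avx512(int32_t *out, const int32_t *in,
                                int32_t fill, uint16_t k) {
    __m512i src = _mm512_set1_epi32(fill);      /* value for masked-off lanes */
    __m512i v = _mm512_mask_load_epi32(src, (__mmask16)k, in);
    _mm512_storeu_si512(out, v);
}

/* Hypothetical helper: out[i] = in[i] where bit i of k is set, else fill.
   in must be aligned on a 64-byte boundary. */
void merge_load16(int32_t *out, const int32_t *in, int32_t fill, uint16_t k) {
    if (__builtin_cpu_supports("avx512f")) {
        merge_load16_avx512(out, in, fill, k);
    } else {
        for (int i = 0; i < 16; i++)            /* scalar equivalent */
            out[i] = ((k >> i) & 1) ? in[i] : fill;
    }
}
```

With k = 0x00FF and fill = -1, the low eight elements come from memory and the high eight are -1.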
_mm512_maskz_load_epi32
extern __m512i __cdecl _mm512_maskz_load_epi32(__mmask16 k, void const* mem_addr);
Load packed int32 elements from memory into destination using zeromask k (elements are zeroed out when the corresponding mask bit is not set). mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
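One possible use of the zeromask form is to load only the first n elements of a block and leave the remaining lanes at zero. This is a sketch under assumptions: load_first_n, load_first_n_avx512, and low_bits_mask are hypothetical names, and the runtime dispatch relies on GCC/Clang extensions.

```c
#include <immintrin.h>
#include <stdint.h>

/* Hypothetical helper: mask with the low n bits set, n clamped to 16. */
static uint16_t low_bits_mask(unsigned n) {
    return (n >= 16) ? 0xFFFFu : (uint16_t)((1u << n) - 1u);
}

/* AVX-512 path (target attribute: GCC/Clang extension, assumed). */
__attribute__((target("avx512f")))
static void load_first_n_avx512(int32_t *out, const int32_t *in, uint16_t k) {
    __m512i v = _mm512_maskz_load_epi32((__mmask16)k, in);
    _mm512_storeu_si512(out, v);
}

/* Writes 16 values to out: in[0..n-1] followed by zeros.
   in must be aligned on a 64-byte boundary. */
void load_first_n(int32_t *out, const int32_t *in, unsigned n) {
    uint16_t k = low_bits_mask(n);
    if (__builtin_cpu_supports("avx512f")) {
        load_first_n_avx512(out, in, k);
    } else {
        for (int i = 0; i < 16; i++)            /* scalar equivalent */
            out[i] = ((k >> i) & 1) ? in[i] : 0;
    }
}
```

Calling load_first_n(out, in, 5) yields the first five input values followed by eleven zeros.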
_mm512_load_epi64
extern __m512i __cdecl _mm512_load_epi64(void const* mem_addr);
Load 512 bits (composed of eight packed int64 elements) from memory into destination. mem_addr must be aligned on a 64-byte boundary or a general-protection exception will be generated.
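A short sketch of the 64-bit variant: two aligned blocks of eight int64 values are loaded, added element by element, and stored to an aligned destination. The names add8_i64 and add8_i64_avx512 are hypothetical, and the dispatch again assumes GCC/Clang extensions plus a compiler version whose headers provide the unmasked _mm512_load_epi64.

```c
#include <immintrin.h>
#include <stdint.h>

/* AVX-512 path (target attribute: GCC/Clang extension, assumed). */
__attribute__((target("avx512f")))
static void add8_i64_avx512(int64_t *out, const int64_t *a, const int64_t *b) {
    __m512i va = _mm512_load_epi64(a);              /* eight aligned int64 */
    __m512i vb = _mm512_load_epi64(b);
    _mm512_store_si512(out, _mm512_add_epi64(va, vb));
}

/* Hypothetical helper: out[i] = a[i] + b[i] for i in 0..7.
   All three pointers must be aligned on a 64-byte boundary. */
void add8_i64(int64_t *out, const int64_t *a, const int64_t *b) {
    if (__builtin_cpu_supports("avx512f")) {
        add8_i64_avx512(out, a, b);
    } else {
        for (int i = 0; i < 8; i++)                 /* scalar fallback */
            out[i] = a[i] + b[i];
    }
}
```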