Developer Guide and Reference

Intrinsics for Store Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
variable     definition
base_addr    pointer to base address in memory to begin load or store operation
mem_addr     pointer to base address in memory
k            writemask used as a selector
a            first source vector element
_mm_mask_compressstoreu_pd
void _mm_mask_compressstoreu_pd(void* base_addr, __mmask8 k, __m128d a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vcompresspd
Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
_mm256_mask_compressstoreu_pd
void _mm256_mask_compressstoreu_pd(void* base_addr, __mmask8 k, __m256d a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vcompresspd
Contiguously store the active double-precision (64-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
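The compress-store behavior described above can be sketched as follows. This is a minimal illustration, not part of the reference: the helper names (compress_store_ref, compress_store_avx512, compress_store4) are hypothetical, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed here so that the file builds without AVX-512 compiler flags and falls back to a scalar loop on CPUs that lack AVX512F/AVX512VL.

```c
#include <immintrin.h>
#include <stddef.h>

/* Scalar model of the documented behavior: copy the elements whose mask
   bit is set, packed contiguously, and return how many were written. */
static size_t compress_store_ref(double *dst, const double *src, unsigned k)
{
    size_t n = 0;
    for (int i = 0; i < 4; ++i)
        if (k & (1u << i))
            dst[n++] = src[i];
    return n;
}

/* Intrinsic version. The target attribute (GCC/Clang extension, an
   assumption of this sketch) enables AVX-512 for this function only. */
__attribute__((target("avx512f,avx512vl")))
static size_t compress_store_avx512(double *dst, const double *src, unsigned k)
{
    __m256d a = _mm256_loadu_pd(src);
    /* Only the low 4 mask bits are meaningful for a 4-element vector. */
    _mm256_mask_compressstoreu_pd(dst, (__mmask8)(k & 0xF), a);
    return (size_t)__builtin_popcount(k & 0xF);
}

/* Hypothetical dispatcher: use the intrinsic when the CPU reports both
   required features, otherwise use the scalar model. */
static size_t compress_store4(double *dst, const double *src, unsigned k)
{
    if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vl"))
        return compress_store_avx512(dst, src, k);
    return compress_store_ref(dst, src, k & 0xF);
}
```

With src = {1.0, 2.0, 3.0, 4.0} and k = 0xA (bits 1 and 3 set), the first two slots of dst receive 2.0 and 4.0, and the rest of dst is left untouched.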
_mm_mask_compressstoreu_ps
void _mm_mask_compressstoreu_ps(void* base_addr, __mmask8 k, __m128 a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vcompressps
Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
_mm256_mask_compressstoreu_ps
void _mm256_mask_compressstoreu_ps(void* base_addr, __mmask8 k, __m256 a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vcompressps
Contiguously store the active single-precision (32-bit) floating-point elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
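A common use of the single-precision compress store is filtering: a comparison produces the writemask, and the store packs the surviving elements. The sketch below assumes that pattern. The function names are hypothetical, _mm256_cmp_ps_mask is an AVX512F/AVX512VL intrinsic not covered in this section, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions used so the example also runs (through the scalar path) on CPUs without AVX-512.

```c
#include <immintrin.h>
#include <stddef.h>

/* Scalar model: keep the elements of an 8-float block that exceed t. */
static size_t keep_greater_ref(float *dst, const float *src, float t)
{
    size_t n = 0;
    for (int i = 0; i < 8; ++i)
        if (src[i] > t)
            dst[n++] = src[i];
    return n;
}

/* Intrinsic version: the compare builds the writemask, the compress
   store writes the selected elements contiguously to dst. */
__attribute__((target("avx512f,avx512vl")))
static size_t keep_greater_avx512(float *dst, const float *src, float t)
{
    __m256 a = _mm256_loadu_ps(src);
    __mmask8 k = _mm256_cmp_ps_mask(a, _mm256_set1_ps(t), _CMP_GT_OQ);
    _mm256_mask_compressstoreu_ps(dst, k, a);
    return (size_t)__builtin_popcount((unsigned)k);
}

/* Hypothetical dispatcher with a runtime CPU feature check. */
static size_t keep_greater8(float *dst, const float *src, float t)
{
    if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vl"))
        return keep_greater_avx512(dst, src, t);
    return keep_greater_ref(dst, src, t);
}
```

The returned count tells the caller how far to advance the output pointer before processing the next block.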
_mm_mask_store_pd
void _mm_mask_store_pd(void* mem_addr, __mmask8 k, __m128d a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vmovapd
Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm256_mask_store_pd
void _mm256_mask_store_pd(void* mem_addr, __mmask8 k, __m256d a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vmovapd
Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
_mm_mask_store_ps
void _mm_mask_store_ps(void* mem_addr, __mmask8 k, __m128 a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vmovaps
Store packed single-precision (32-bit) floating-point elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set). mem_addr must be aligned on a 16-byte boundary or a general-protection exception may be generated.
_mm256_mask_store_ps
void _mm256_mask_store_ps(void* mem_addr, __mmask8 k, __m256 a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vmovaps
Store packed single-precision (32-bit) floating-point elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set). mem_addr must be aligned on a 32-byte boundary or a general-protection exception may be generated.
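The aligned masked store can be illustrated with a short sketch in which memory at positions whose mask bit is clear keeps its previous contents. The helper names are hypothetical. The target attribute, __builtin_cpu_supports, and the aligned attribute in the usage note are GCC/Clang extensions assumed for this example, and the caller is responsible for passing a destination that meets the 32-byte alignment requirement.

```c
#include <immintrin.h>

/* Scalar model: overwrite only the positions selected by the mask. */
static void masked_store_ref(double *dst, const double *src, unsigned k)
{
    for (int i = 0; i < 4; ++i)
        if (k & (1u << i))
            dst[i] = src[i];
}

/* Intrinsic version. dst32 must be 32-byte aligned; src may be unaligned
   because it is read with an unaligned load. */
__attribute__((target("avx512f,avx512vl")))
static void masked_store_avx512(double *dst32, const double *src, unsigned k)
{
    _mm256_mask_store_pd(dst32, (__mmask8)(k & 0xF), _mm256_loadu_pd(src));
}

/* Hypothetical dispatcher with a runtime CPU feature check. */
static void masked_store4(double *dst32, const double *src, unsigned k)
{
    if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vl"))
        masked_store_avx512(dst32, src, k);
    else
        masked_store_ref(dst32, src, k & 0xF);
}
```

A destination declared as double dst[4] __attribute__((aligned(32))) satisfies the alignment requirement. With k = 0x5, only dst[0] and dst[2] are overwritten.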
_mm_mask_storeu_pd
void _mm_mask_storeu_pd(void* mem_addr, __mmask8 k, __m128d a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vmovupd
Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
_mm256_mask_storeu_pd
void _mm256_mask_storeu_pd(void* mem_addr, __mmask8 k, __m256d a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vmovupd
Store packed double-precision (64-bit) floating-point elements from a into memory using writemask k (elements are not stored when the corresponding mask bit is not set). mem_addr does not need to be aligned on any particular boundary.
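One place the unaligned masked store tends to be useful is the tail of a loop, where fewer than a full vector of elements remain. The sketch below assumes that use case and is not taken from the reference. The function names are hypothetical, _mm256_maskz_loadu_pd is an AVX512F/AVX512VL load intrinsic outside this section, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions.

```c
#include <immintrin.h>
#include <stddef.h>

/* Scalar model: multiply n doubles in place by f. */
static void scale_ref(double *x, size_t n, double f)
{
    for (size_t i = 0; i < n; ++i)
        x[i] *= f;
}

/* Intrinsic version: full 4-element blocks use plain unaligned stores;
   the remaining 1 to 3 elements use a masked load and masked store so
   nothing past x[n-1] is written. */
__attribute__((target("avx512f,avx512vl")))
static void scale_avx512(double *x, size_t n, double f)
{
    const __m256d vf = _mm256_set1_pd(f);
    size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm256_storeu_pd(x + i, _mm256_mul_pd(_mm256_loadu_pd(x + i), vf));
    if (i < n) {
        /* Set the low (n - i) bits of the writemask. */
        __mmask8 k = (__mmask8)((1u << (n - i)) - 1u);
        __m256d v = _mm256_maskz_loadu_pd(k, x + i);
        _mm256_mask_storeu_pd(x + i, k, _mm256_mul_pd(v, vf));
    }
}

/* Hypothetical dispatcher with a runtime CPU feature check. */
static void scale(double *x, size_t n, double f)
{
    if (__builtin_cpu_supports("avx512f") && __builtin_cpu_supports("avx512vl"))
        scale_avx512(x, n, f);
    else
        scale_ref(x, n, f);
}
```

Calling scale on the first 7 elements of an 8-element array leaves the eighth element unchanged, because the tail mask selects only the three remaining positions.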
_mm_mask_storeu_ps
void _mm_mask_storeu_ps(void* mem_addr, __mmask8 k, __m128 a)
CPUID Flags: AVX512F, AVX512VL
Instruction(s): vmovups
Store packed single-precision (