Developer Guide and Reference

Intrinsics for Compression Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:

#include <immintrin.h>

Intrinsic Name                      Operation                                    Corresponding Intel® AVX-512 Instruction
_mm512_mask_compress_pd,            Contiguously store active float64 elements.  VCOMPRESSPD
_mm512_maskz_compress_pd
_mm512_mask_compress_ps,            Contiguously store active float32 elements.  VCOMPRESSPS
_mm512_maskz_compress_ps
_mm512_mask_compress_epi32,         Contiguously store active int32 elements.    VPCOMPRESSD
_mm512_maskz_compress_epi32,
_mm512_mask_compressstoreu_epi32
_mm512_mask_compress_epi64,         Contiguously store active int64 elements.    VPCOMPRESSQ
_mm512_maskz_compress_epi64

variable    definition
k           writemask used as a selector
a           first source vector element
src         source element to use based on writemask result
base_addr   pointer to base address in memory to begin load or store operation

_mm512_mask_compress_pd
extern __m512d __cdecl _mm512_mask_compress_pd(__m512d src, __mmask8 k, __m512d a);
Contiguously stores the active float64 elements in a (those with their respective bit set in writemask k) to destination, and passes through the remaining elements from src.
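
The lane movement described above can be sketched as a scalar model. This is only an illustration of the documented semantics, not Intel code: the function name model_mask_compress_pd and the array-based interface are hypothetical, with each 8-element array standing in for the float64 lanes of one 512-bit vector.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model of the masked compress operation (not an Intel API).
   a[]   : the 8 float64 lanes to compress
   k     : writemask, bit i selects lane i of a[]
   src[] : pass-through lanes used where no active element lands
   dst[] : result lanes; must not overlap a[] or src[] */
static void model_mask_compress_pd(double dst[8], const double src[8],
                                   uint8_t k, const double a[8])
{
    int n = 0;                       /* next free destination lane */
    for (int i = 0; i < 8; ++i)
        if (k & (1u << i))
            dst[n++] = a[i];         /* pack active lanes toward lane 0 */
    for (int i = n; i < 8; ++i)
        dst[i] = src[i];             /* remaining lanes come from src, same lane index */
}
```

Note that the pass-through lanes keep their own lane positions: with three active elements, destination lanes 3 through 7 are copied from lanes 3 through 7 of src.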
_mm512_maskz_compress_pd
extern __m512d __cdecl _mm512_maskz_compress_pd(__mmask8 k, __m512d a);
Contiguously stores the active float64 elements in a (those with their respective bit set in zeromask k) to destination, and sets the remaining elements to zero.
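
A minimal usage sketch of the zero-masked form follows. The helper name compress_or_zero_pd is hypothetical, and the scalar branch is an assumption added so the example still builds when the compiler is not targeting AVX-512 (that is, when __AVX512F__ is not defined).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: pack the lanes of in[] selected by k into the low
   lanes of out[] and zero the rest. */
static void compress_or_zero_pd(double out[8], uint8_t k, const double in[8])
{
#if defined(__AVX512F__)
    __m512d a = _mm512_loadu_pd(in);                       /* unaligned load */
    _mm512_storeu_pd(out, _mm512_maskz_compress_pd((__mmask8)k, a));
#else
    /* Scalar stand-in with the same result, for builds without AVX-512. */
    double tmp[8] = {0};
    int n = 0;
    for (int i = 0; i < 8; ++i)
        if (k & (1u << i))
            tmp[n++] = in[i];
    memcpy(out, tmp, sizeof tmp);
#endif
}
```

With GCC or Clang, the intrinsic branch is typically enabled by a target option such as -mavx512f, and the resulting binary then requires a processor that supports Intel® AVX-512.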
_mm512_mask_compress_ps
extern __m512 __cdecl _mm512_mask_compress_ps(__m512 src, __mmask16 k, __m512 a);
Contiguously stores the active float32 elements in a (those with their respective bit set in writemask k) to destination, and passes through the remaining elements from src.
_mm512_maskz_compress_ps
extern __m512 __cdecl _mm512_maskz_compress_ps(__mmask16 k, __m512 a);
Contiguously stores the active float32 elements in a (those with their respective bit set in zeromask k) to destination, and sets the remaining elements to zero.
_mm512_mask_compressstoreu_pd
extern void __cdecl _mm512_mask_compressstoreu_pd(void* base_addr, __mmask8 k, __m512d a);
Contiguously stores the active float64 elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
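
Because only the active elements are written, a compress store is a natural fit for filtering an array: each step writes as many values as there are set bits in k, and the output position advances by that count. The sketch below assumes a hypothetical helper, keep_positive_pd, that keeps strictly positive values; the scalar loop handles the tail and also serves as the whole implementation when __AVX512F__ is not defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: copy the strictly positive values of in[0..n) to out[]
   in their original order and return how many were written.
   out must have room for n doubles. */
static size_t keep_positive_pd(double *out, const double *in, size_t n)
{
    size_t written = 0, i = 0;
#if defined(__AVX512F__)
    const __m512d zero = _mm512_setzero_pd();
    for (; i + 8 <= n; i += 8) {
        __m512d v = _mm512_loadu_pd(in + i);
        __mmask8 k = _mm512_cmp_pd_mask(v, zero, _CMP_GT_OQ);  /* lanes > 0 */
        _mm512_mask_compressstoreu_pd(out + written, k, v);    /* write active lanes only */
        for (unsigned m = k; m != 0; m &= m - 1)               /* count set bits in k */
            ++written;
    }
#endif
    for (; i < n; ++i)                                         /* scalar tail or fallback */
        if (in[i] > 0.0)
            out[written++] = in[i];
    return written;
}
```

The bit-counting loop is written out by hand only to keep the sketch portable; a population-count builtin or intrinsic would serve the same purpose.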
_mm512_mask_compressstoreu_ps
extern void __cdecl _mm512_mask_compressstoreu_ps(void* base_addr, __mmask16 k, __m512 a);
Contiguously stores the active float32 elements in a (those with their respective bit set in writemask k) to unaligned memory at base_addr.
_mm512_mask_compress_epi32
extern __m512i __cdecl _mm512_mask_compress_epi32(__m512i src, __mmask16 k, __m512i a);
Contiguously stores the active int32 elements in a (those with their respective bit set in writemask k) to destination, and passes through the remaining elements from src.
_mm512_maskz_compress_epi32
extern __m512i __cdecl _mm512_maskz_compress_epi32(__mmask16 k, __m512i a);
Contiguously stores the active int32 elements in a (those with their respective bit set in zeromask k) to destination, and sets the remaining elements to zero.
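
One possible use of the int32 form is turning a mask into a list of lane indices: compressing a vector that holds 0 through 15 leaves the positions of the set bits packed at the front. The helper name mask_to_lane_indices is hypothetical, and the scalar branch is an assumption for builds where __AVX512F__ is not defined.

```c
#include <assert.h>
#include <stdint.h>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

/* Hypothetical helper: write the positions of the set bits of k to idx[] in
   ascending order, zero the unused entries, and return the number of set bits. */
static int mask_to_lane_indices(int32_t idx[16], uint16_t k)
{
    int count = 0;
    for (unsigned m = k; m != 0; m &= m - 1)   /* count set bits in k */
        ++count;
#if defined(__AVX512F__)
    const __m512i lanes = _mm512_setr_epi32(0, 1, 2, 3, 4, 5, 6, 7,
                                            8, 9, 10, 11, 12, 13, 14, 15);
    _mm512_storeu_si512(idx, _mm512_maskz_compress_epi32((__mmask16)k, lanes));
#else
    /* Scalar stand-in with the same result, for builds without AVX-512. */
    int n = 0;
    for (int i = 0; i < 16; ++i)
        idx[i] = 0;
    for (int i = 0; i < 16; ++i)
        if (k & (1u << i))
            idx[n++] = i;
#endif
    return count;
}
```

Since unused entries are zeroed, the returned count is needed to tell a genuine index 0 apart from padding.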
_mm512_mask_compress_epi64
extern __m512i __cdecl _mm512_mask_compress_epi64(__m512i src, __mmask8 k, __m512i a);
Contiguously stores the active int64 elements in a (those with their respective bit set in writemask k) to destination, and passes through the remaining elements from src.
_mm512_maskz_compress_epi64
extern __m512i __cdecl _mm512_maskz_compress_epi64(__mmask8 k, __m512i a);
Contiguously stores the active int64 elements in a (those with their respective bit set in zeromask k) to destination, and sets the remaining elements to zero.
_mm512_mask_compressstoreu_epi32
extern void __cdecl _mm512_mask_compressstoreu_epi32(void* base_addr, __mmask16 k, __m512i a);
Contiguously stores the active int32 elements in
a
(those with their respective bit set in writemask