Developer Guide and Reference

Intrinsics for FP Multiplication Operations

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
Intrinsic Name | Operation | Corresponding Intel® AVX-512 Instruction
_mm_mul_round_sd, _mm_mask_mul_round_sd, _mm_maskz_mul_round_sd, _mm_mask_mul_sd, _mm_maskz_mul_sd | Multiplies the lowest float64 elements and rounds the result. | VMULSD
_mm_mul_round_ss, _mm_mask_mul_round_ss, _mm_maskz_mul_round_ss, _mm_mask_mul_ss, _mm_maskz_mul_ss | Multiplies the lowest float32 elements and rounds the result. | VMULSS
_mm512_mul_round_pd, _mm512_mask_mul_round_pd, _mm512_maskz_mul_round_pd, _mm512_mul_pd, _mm512_mask_mul_pd, _mm512_maskz_mul_pd | Multiplies float64 vectors and rounds the result. | VMULPD
_mm512_mul_round_ps, _mm512_mask_mul_round_ps, _mm512_maskz_mul_round_ps, _mm512_mul_ps, _mm512_mask_mul_ps, _mm512_maskz_mul_ps | Multiplies float32 vectors and rounds the result. | VMULPS
variable | definition
k | writemask used as a selector
a | first source vector element
b | second source vector element
src | source element to use based on writemask result
round | Rounding control values; these can be one of the following (along with the sae suppress all exceptions flag):
  • _MM_FROUND_TO_NEAREST_INT - rounds to nearest even
  • _MM_FROUND_TO_NEG_INF - rounds to negative infinity
  • _MM_FROUND_TO_POS_INF - rounds to positive infinity
  • _MM_FROUND_TO_ZERO - rounds to zero
  • _MM_FROUND_CUR_DIRECTION - rounds using default from MXCSR register
_mm512_mul_pd
extern __m512d __cdecl _mm512_mul_pd(__m512d a, __m512d b);
Multiplies packed float64 elements in a and b, stores the result.
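As a minimal sketch of how this intrinsic might be called, the block below multiplies two arrays of eight doubles lane by lane. The helper names mul8_pd and try_mul8_pd are hypothetical, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed here so that the code compiles without a global AVX-512 flag and can be skipped on processors without AVX-512F.

```c
#include <immintrin.h>

/* Hypothetical helper: out[i] = a[i] * b[i] for i = 0..7.
   The target attribute (GCC/Clang extension) enables AVX-512F
   for this one function only. */
__attribute__((target("avx512f")))
static void mul8_pd(const double *a, const double *b, double *out)
{
    __m512d va = _mm512_loadu_pd(a);        /* unaligned load of 8 doubles */
    __m512d vb = _mm512_loadu_pd(b);
    __m512d prod = _mm512_mul_pd(va, vb);   /* lane-wise product */
    _mm512_storeu_pd(out, prod);            /* unaligned store of 8 doubles */
}

/* Returns 1 if the product was computed, 0 if the CPU lacks AVX-512F. */
int try_mul8_pd(const double *a, const double *b, double *out)
{
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    mul8_pd(a, b, out);
    return 1;
}
```

The unaligned load and store intrinsics are used so that the arrays do not need 64-byte alignment.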
_mm512_mask_mul_pd
extern __m512d __cdecl _mm512_mask_mul_pd(__m512d src, __mmask8 k, __m512d a, __m512d b);
Multiplies packed float64 elements in a and b, stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
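The writemask behavior described above can be sketched as follows. The helper names are hypothetical, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed for portability of the example. Bit i of k controls lane i.

```c
#include <immintrin.h>

/* Hypothetical helper: lane i receives a[i] * b[i] when bit i of k is set,
   otherwise it receives src[i]. */
__attribute__((target("avx512f")))
static void mask_mul8_pd(const double *src, unsigned char k,
                         const double *a, const double *b, double *out)
{
    __m512d vsrc = _mm512_loadu_pd(src);
    __m512d va = _mm512_loadu_pd(a);
    __m512d vb = _mm512_loadu_pd(b);
    __m512d r = _mm512_mask_mul_pd(vsrc, (__mmask8)k, va, vb);
    _mm512_storeu_pd(out, r);
}

/* Returns 1 if the result was computed, 0 if the CPU lacks AVX-512F. */
int try_mask_mul8_pd(const double *src, unsigned char k,
                     const double *a, const double *b, double *out)
{
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    mask_mul8_pd(src, k, a, b, out);
    return 1;
}
```

With k = 0x0F, for example, lanes 0 through 3 hold products and lanes 4 through 7 hold the values passed in src.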
_mm512_maskz_mul_pd
extern __m512d __cdecl _mm512_maskz_mul_pd(__mmask8 k, __m512d a, __m512d b);
Multiplies packed float64 elements in a and b, stores the result using zeromask k (elements are zeroed out when the corresponding mask bit is not set).
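The zeromask variant might be used as in the sketch below, which differs from the writemask form only in that unselected lanes become 0.0 instead of being copied from a source vector. The helper names are hypothetical, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed for the example.

```c
#include <immintrin.h>

/* Hypothetical helper: lane i receives a[i] * b[i] when bit i of k is set,
   otherwise it is set to 0.0. */
__attribute__((target("avx512f")))
static void maskz_mul8_pd(unsigned char k, const double *a,
                          const double *b, double *out)
{
    __m512d va = _mm512_loadu_pd(a);
    __m512d vb = _mm512_loadu_pd(b);
    __m512d r = _mm512_maskz_mul_pd((__mmask8)k, va, vb);
    _mm512_storeu_pd(out, r);
}

/* Returns 1 if the result was computed, 0 if the CPU lacks AVX-512F. */
int try_maskz_mul8_pd(unsigned char k, const double *a,
                      const double *b, double *out)
{
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    maskz_mul8_pd(k, a, b, out);
    return 1;
}
```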
_mm512_mul_round_pd
extern __m512d __cdecl _mm512_mul_round_pd(__m512d a, __m512d b, int round);
Multiplies packed float64 elements in a and b using the rounding mode specified by round, stores the result.
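To illustrate the effect of the round argument, the sketch below computes each product twice, once rounded toward negative infinity and once toward positive infinity, giving a lower and an upper bound on the exact result. The helper names are hypothetical, and the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed for the example. The round argument must be a compile-time constant. The directed modes are combined here with _MM_FROUND_NO_EXC, the suppress-all-exceptions flag, which GCC is assumed to require for any mode other than _MM_FROUND_CUR_DIRECTION.

```c
#include <immintrin.h>

/* Hypothetical helper: lo[i] and hi[i] bracket the exact product a[i] * b[i]. */
__attribute__((target("avx512f")))
static void mul8_bounds_pd(const double *a, const double *b,
                           double *lo, double *hi)
{
    __m512d va = _mm512_loadu_pd(a);
    __m512d vb = _mm512_loadu_pd(b);
    /* Round each product toward negative infinity. */
    __m512d down = _mm512_mul_round_pd(
        va, vb, _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
    /* Round each product toward positive infinity. */
    __m512d up = _mm512_mul_round_pd(
        va, vb, _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC);
    _mm512_storeu_pd(lo, down);
    _mm512_storeu_pd(hi, up);
}

/* Returns 1 if the bounds were computed, 0 if the CPU lacks AVX-512F. */
int try_mul8_bounds_pd(const double *a, const double *b,
                       double *lo, double *hi)
{
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    mul8_bounds_pd(a, b, lo, hi);
    return 1;
}
```

When a product is exactly representable, both bounds are equal. When it is not, they are adjacent float64 values: with e = 2^-52, the exact square of 1 + e is 1 + 2e + e^2, so the lower bound is 1 + 2e and the upper bound is 1 + 3e.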
_mm512_mask_mul_round_pd
extern __m512d __cdecl _mm512_mask_mul_round_pd(__m512d src, __mmask8 k, __m512d a, __m512d b, int round);
Multiplies packed float64 elements in a and b using the rounding mode specified by round, stores the result using writemask k (elements are copied from src when the corresponding mask bit is not set).
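Masking and explicit rounding can be combined, as in the following sketch, which rounds the selected products toward positive infinity and leaves the remaining lanes at their src values. The helper names are hypothetical, the target attribute and __builtin_cpu_supports are GCC/Clang extensions assumed for the example, and the choice of rounding mode is only illustrative.

```c
#include <immintrin.h>

/* Hypothetical helper: lane i receives a[i] * b[i] rounded toward positive
   infinity when bit i of k is set, otherwise it receives src[i]. */
__attribute__((target("avx512f")))
static void mask_mul8_up_pd(const double *src, unsigned char k,
                            const double *a, const double *b, double *out)
{
    __m512d vsrc = _mm512_loadu_pd(src);
    __m512d va = _mm512_loadu_pd(a);
    __m512d vb = _mm512_loadu_pd(b);
    __m512d r = _mm512_mask_mul_round_pd(
        vsrc, (__mmask8)k, va, vb,
        _MM_FROUND_TO_POS_INF | _MM_FROUND_NO_EXC);
    _mm512_storeu_pd(out, r);
}

/* Returns 1 if the result was computed, 0 if the CPU lacks AVX-512F. */
int try_mask_mul8_up_pd(const double *src, unsigned char k,
                        const double *a, const double *b, double *out)
{
    if (!__builtin_cpu_supports("avx512f"))
        return 0;
    mask_mul8_up_pd(src, k, a, b, out);
    return 1;
}
```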
_mm512_maskz_mul_round_pd
extern __m512d __cdecl _mm512_maskz_mul_round_pd(__mmask8 k, __m512d a, __m512d b, int round);
Multiplies packed float64 elements in
a
and