Developer Guide and Reference

Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) 4FMAPS Instructions

The prototypes for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) 4FMAPS instruction intrinsics are located in the zmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
_mm512_4fmadd_ps
__m512 _mm512_4fmadd_ps (__m512 c, __m512 a0, __m512 a1, __m512 a2, __m512 a3, __m128 * b)
variable          definition
a0, a1, a2, a3    first source block (4 vectors)
b                 pointer to the second source block
c                 third source; accumulator
Instructions: v4fmaddps zmm1, zmm2+3, m128
Multiplies packed single-precision floating-point values from source register block {a0, a1, a2, a3} by floating-point values pointed to by b and accumulates the result in c.
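As a rough illustration of the operation described above, the sketch below wraps the intrinsic in a helper and pairs it with a portable scalar loop that shows the same arithmetic lane by lane. The helper name accumulate4_ps is hypothetical, and the __AVX5124FMAPS__ feature macro is an assumption about GCC- and Clang-style compilers. The scalar branch uses a separate multiply and add, whereas the hardware fuses each step, so results may differ in the last bit when products are not exactly representable.

```c
#include <stddef.h>
#if defined(__AVX5124FMAPS__)   /* assumed compiler feature macro */
#include <immintrin.h>
#endif

/* Hypothetical helper: for each of the 16 lanes j,
 * c[j] accumulates a0[j]*b[0], a1[j]*b[1], a2[j]*b[2], a3[j]*b[3]
 * in four sequential multiply-add steps. */
static void accumulate4_ps(float c[16],
                           const float a0[16], const float a1[16],
                           const float a2[16], const float a3[16],
                           const float b[4])
{
#if defined(__AVX5124FMAPS__)
    __m128 bv  = _mm_loadu_ps(b);           /* the four scalars b[0..3] */
    __m512 acc = _mm512_loadu_ps(c);        /* accumulator */
    acc = _mm512_4fmadd_ps(acc,
                           _mm512_loadu_ps(a0), _mm512_loadu_ps(a1),
                           _mm512_loadu_ps(a2), _mm512_loadu_ps(a3), &bv);
    _mm512_storeu_ps(c, acc);
#else
    /* Portable scalar model of the same computation. */
    const float *a[4] = { a0, a1, a2, a3 };
    for (size_t step = 0; step < 4; ++step)        /* one step per source vector */
        for (size_t lane = 0; lane < 16; ++lane)   /* 16 packed floats */
            c[lane] += a[step][lane] * b[step];
#endif
}
```

Note that each step multiplies a whole vector by a single scalar taken from b, so the memory operand contributes four scalars rather than a fourth vector.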
_mm512_mask_4fmadd_ps
__m512 _mm512_mask_4fmadd_ps (__m512 c, __mmask16 k, __m512 a0, __m512 a1, __m512 a2, __m512 a3, __m128 * b)
variable          definition
a0, a1, a2, a3    first source block (4 vectors)
b                 pointer to the second source block
c                 third source; accumulator
k                 mask used as a selector
Instructions: v4fmaddps zmm1 {k}, zmm2+3, m128
Multiplies packed single-precision floating-point values from source register block {a0, a1, a2, a3} by floating-point values pointed to by b and accumulates the result in c, using mask k. Elements are copied from c when the corresponding mask bit is not set.
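The merge-masking behavior can be sketched with a scalar model. This is an illustration under stated assumptions rather than Intel's implementation: the function name model_mask_4fmadd_ps is hypothetical, plain float arrays stand in for the __m512 operands, bit j of k is assumed to control lane j, and the multiply and add are not fused as they are in hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of merge masking for the 4-step multiply-add. */
static void model_mask_4fmadd_ps(float c[16], uint16_t k,
                                 const float a0[16], const float a1[16],
                                 const float a2[16], const float a3[16],
                                 const float b[4])
{
    const float *a[4] = { a0, a1, a2, a3 };
    for (size_t lane = 0; lane < 16; ++lane) {
        if (!((k >> lane) & 1u))
            continue;                   /* mask bit clear: lane keeps its value from c */
        for (size_t step = 0; step < 4; ++step)
            c[lane] += a[step][lane] * b[step];
    }
}
```

With k set to 0x00FF, for example, only lanes 0 through 7 are accumulated and lanes 8 through 15 pass through unchanged.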
_mm512_maskz_4fmadd_ps
__m512 _mm512_maskz_4fmadd_ps (__m512 c, __mmask16 k, __m512 a0, __m512 a1, __m512 a2, __m512 a3, __m128 * b)
variable          definition
a0, a1, a2, a3    first source block (4 vectors)
b                 pointer to the second source block
c                 third source; accumulator
k                 mask used as a selector
Instructions: v4fmaddps zmm1 {k}{z}, zmm2+3, m128
Multiplies packed single-precision floating-point values from source register block {a0, a1, a2, a3} by floating-point values pointed to by b and accumulates the result in c, using mask k. Elements are zeroed out when the corresponding mask bit is not set.
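Zero masking differs from merge masking only in what happens to unselected lanes, which a scalar model makes explicit. The sketch below is hypothetical: model_maskz_4fmadd_ps, its separate dst output, and its parameter order are illustrative choices, bit j of k is assumed to control lane j, and the multiply and add are not fused as they are in hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of zero masking for the 4-step multiply-add.
 * dst receives the result; c supplies the initial accumulator values. */
static void model_maskz_4fmadd_ps(float dst[16], uint16_t k, const float c[16],
                                  const float a0[16], const float a1[16],
                                  const float a2[16], const float a3[16],
                                  const float b[4])
{
    const float *a[4] = { a0, a1, a2, a3 };
    for (size_t lane = 0; lane < 16; ++lane) {
        if ((k >> lane) & 1u) {
            float acc = c[lane];
            for (size_t step = 0; step < 4; ++step)
                acc += a[step][lane] * b[step];
            dst[lane] = acc;            /* mask bit set: accumulated value */
        } else {
            dst[lane] = 0.0f;           /* mask bit clear: lane is zeroed */
        }
    }
}
```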
_mm512_4fnmadd_ps
__m512 _mm512_4fnmadd_ps (__m512 c, __m512 a0, __m512 a1, __m512 a2, __m512 a3, __m128 * b)
variable          definition
a0, a1, a2, a3    first source block (4 vectors)
b                 pointer to the second source block
c                 third source; accumulator
Instructions: v4fnmaddps zmm1, zmm2+3, m128
Multiplies packed single-precision floating-point values from source register block {a0, a1, a2, a3} by floating-point values pointed to by b, negates the products, and accumulates the result in c.
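The negated form subtracts each product from the accumulator instead of adding it. A minimal scalar sketch follows, assuming plain float arrays in place of the __m512 operands; model_4fnmadd_ps is a hypothetical name, and the unfused multiply and subtract only approximate the single rounding that the hardware applies per step.

```c
#include <stddef.h>

/* Hypothetical scalar model: c[j] = -(a[step][j] * b[step]) + c[j],
 * applied for step = 0..3 in order. */
static void model_4fnmadd_ps(float c[16],
                             const float a0[16], const float a1[16],
                             const float a2[16], const float a3[16],
                             const float b[4])
{
    const float *a[4] = { a0, a1, a2, a3 };
    for (size_t step = 0; step < 4; ++step)
        for (size_t lane = 0; lane < 16; ++lane)
            c[lane] -= a[step][lane] * b[step];   /* negated product, then accumulate */
}
```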
_mm512_mask_4fnmadd_ps
__m512 _mm512_mask_4fnmadd_ps (__m512 c, __mmask16 k, __m512 a0, __m512 a1, __m512 a2, __m512 a3, __m128 * b)
variable          definition
a0, a1, a2, a3    first source block (4 vectors)
b                 pointer to the second source block
c                 third source; accumulator
k                 mask used as a selector
Instructions: v4fnmaddps zmm1 {k}, zmm2+3, m128
Multiplies packed single-precision floating-point values from source register block {a0, a1, a2, a3} by floating-point values pointed to by b, negates the products, and accumulates the result in c, using mask k. Elements are copied from c when the corresponding mask bit is not set.
_mm512_maskz_4fnmadd_ps
__m512 _mm512_maskz_4fnmadd_ps (__m512 c, __mmask16 k, __m512 a0, __m512 a1, __m512 a2, __m512 a3, __m128 * b)