Developer Guide and Reference

_mm256_dp_ps

Calculates a conditional dot product of packed float32 elements, independently within each 128-bit half of the source vectors. The corresponding Intel® AVX instruction is VDPPS.

Syntax

extern __m256 _mm256_dp_ps(__m256 m1, __m256 m2, const int mask);
Arguments
m1
float32 vector used for the operation
m2
float32 vector also used for the operation
mask
an 8-bit integer constant (it must be known at compile time); the high four bits (bits 7:4) select which of the four element products in each 128-bit half are included in the sum, and the low four bits (bits 3:0) select which destination elements in each half receive the sum (unselected elements are set to zero)
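As a rough illustration of how the two halves of mask fit together, the sketch below builds the constant from a 4-bit "sum" selector and a 4-bit "store" selector. The macro DP_MASK and the enumerator names are hypothetical helpers introduced here for illustration; they are not part of the intrinsics headers.

```c
/* Hypothetical helper, not provided by any Intel header:
 * packs the product-selection bits into bits 7:4 and the
 * destination-selection bits into bits 3:0 of the mask. */
#define DP_MASK(sum_bits, store_bits) \
    ((((sum_bits) & 0xF) << 4) | ((store_bits) & 0xF))

enum {
    /* Sum all four products, store the sum in element 0 only. */
    DP_SUM_ALL_STORE_FIRST = DP_MASK(0xF, 0x1),   /* 0xF1 */
    /* Sum the first three products, store the sum in all four elements. */
    DP_SUM_THREE_STORE_ALL = DP_MASK(0x7, 0xF)    /* 0x7F */
};
```

Because the macro expands to a constant expression, its result can be passed where a compile-time constant mask is required.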
Description
First performs a SIMD multiplication of the lower four packed single-precision floating-point elements (float32 elements) from the first source vector m1 with corresponding elements in the second source vector m2.
Each of the four resulting products is included in the sum only if the corresponding high bit (bits 7:4) of the mask parameter is set.
The resulting sum is written to each of the lower four positions in the destination vector whose corresponding low bit (bits 3:0) of mask is 1. Where the corresponding low bit of mask is 0, that element of the destination vector is set to zero.
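The same process, using the same mask, is then applied to the upper four elements of the source vectors to produce the upper four elements of the destination vector. The two halves are computed independently, so the result holds two four-element dot products rather than one eight-element dot product.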
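The steps above can be sketched as a scalar reference model. This is an illustrative approximation, not Intel's implementation: the function name dp_ps_model and its array-based signature are hypothetical, and the pairwise order of the additions is an assumption that may matter for rounding on inputs that are not exactly representable.

```c
/* Hypothetical scalar model of the behaviour described above.
 * m1, m2, dst each hold 8 floats: elements 0..3 are the low
 * 128-bit half, elements 4..7 are the high 128-bit half. */
void dp_ps_model(const float m1[8], const float m2[8],
                 int mask, float dst[8])
{
    for (int half = 0; half < 2; ++half) {
        const int base = half * 4;
        float p[4];

        /* Multiply; keep a product only if its high mask bit is set. */
        for (int i = 0; i < 4; ++i) {
            const int selected = (mask >> (4 + i)) & 1;
            p[i] = selected ? m1[base + i] * m2[base + i] : 0.0f;
        }

        /* Sum the selected products (pairwise order is an assumption). */
        const float sum = (p[0] + p[1]) + (p[2] + p[3]);

        /* Store the sum where the low mask bit is set, zero elsewhere. */
        for (int i = 0; i < 4; ++i) {
            const int store = (mask >> i) & 1;
            dst[base + i] = store ? sum : 0.0f;
        }
    }
}
```

For example, with mask 0xF1 the model places the full four-element dot product of each half in elements 0 and 4 of dst and zeroes the remaining elements. On a target with Intel® AVX enabled, the corresponding intrinsic call would be _mm256_dp_ps(m1, m2, 0xF1) on __m256 operands.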
Returns
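A 256-bit vector in which each 128-bit half holds the conditional dot product of the corresponding halves of m1 and m2, written to or zeroed in each element according to the low four bits of mask.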