Developer Guide and Reference

Load Intrinsics

The prototypes for Intel® Streaming SIMD Extensions (Intel® SSE) intrinsics for load operations are in the xmmintrin.h header file.
To use these intrinsics, include the immintrin.h file as follows:
#include <immintrin.h>
The results of each intrinsic operation are placed in a register. This register is illustrated for each intrinsic with R0-R3. R0, R1, R2, and R3 each represent one of the four 32-bit pieces of the result register.
| Intrinsic Name | Operation | Corresponding Intel® SSE Instruction |
|---|---|---|
| _mm_loadh_pi | Load high | MOVHPS reg, mem |
| _mm_loadl_pi | Load low | MOVLPS reg, mem |
| _mm_load_ss | Load the low value and clear the three high values | MOVSS |
| _mm_load1_ps | Load one value into all four words | MOVSS + Shuffling |
| _mm_load_ps | Load four values, address aligned | MOVAPS |
| _mm_loadu_ps | Load four values, address unaligned | MOVUPS |
| _mm_loadr_ps | Load four values in reverse | MOVAPS + Shuffling |
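The four full-width loads in the table differ mainly in alignment requirements and lane order. The following is a minimal sketch of how they might be compared, not a definitive usage pattern. The helper names (`lanes_of`, `load_aligned`, `load_unaligned`, `load_reversed`, `load_broadcast`) are hypothetical and exist only for illustration; each one writes lanes R0-R3 of the loaded register to `out[0]` through `out[3]`. The sketch assumes that `_mm_load_ps` and `_mm_loadr_ps` are given a 16-byte-aligned address.

```c
#include <assert.h>
#include <stdint.h>
#include <immintrin.h>

/* Hypothetical helper: copy lanes R0-R3 of v into out[0..3]. */
static void lanes_of(__m128 v, float out[4]) {
    _mm_storeu_ps(out, v);            /* unaligned store, safe for any out */
}

/* _mm_load_ps: src must be 16-byte aligned (checked here with assert). */
static void load_aligned(const float *src, float out[4]) {
    assert(((uintptr_t)src & 15u) == 0);
    lanes_of(_mm_load_ps(src), out);  /* R0..R3 = src[0..3] */
}

/* _mm_loadu_ps: no alignment requirement on src. */
static void load_unaligned(const float *src, float out[4]) {
    lanes_of(_mm_loadu_ps(src), out); /* R0..R3 = src[0..3] */
}

/* _mm_loadr_ps: aligned load with the lane order reversed. */
static void load_reversed(const float *src, float out[4]) {
    assert(((uintptr_t)src & 15u) == 0);
    lanes_of(_mm_loadr_ps(src), out); /* R0..R3 = src[3], src[2], src[1], src[0] */
}

/* _mm_load1_ps: one value copied into all four lanes. */
static void load_broadcast(const float *src, float out[4]) {
    lanes_of(_mm_load1_ps(src), out); /* R0..R3 = src[0] */
}
```

Passing a 16-byte-aligned array holding 1, 2, 3, 4 to `load_reversed` would be expected to produce 4, 3, 2, 1 in `out`, while `load_unaligned` also accepts an address such as the second element of that array.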

_mm_loadh_pi

__m128 _mm_loadh_pi(__m128 a, __m64 const *p);
Sets the upper two single-precision floating-point (SP FP) values with 64 bits of data loaded from the address p; the lower two values are passed through from a.

| R0 | R1 | R2 | R3 |
|---|---|---|---|
| a0 | a1 | *p0 | *p1 |
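As an illustration of the register layout described for `_mm_loadh_pi`, here is a minimal sketch. The helper name `replace_high_pair` is hypothetical. The sketch assumes the 64 bits at `p` are two adjacent floats, which is why the `float` pointer is cast to `__m64 const *`.

```c
#include <assert.h>
#include <immintrin.h>

/* Hypothetical helper: keep lanes R0, R1 of a and replace R2, R3
   with the two floats stored at hi. Result lanes go to out[0..3]. */
static void replace_high_pair(const float a[4], const float hi[2], float out[4]) {
    assert(a && hi && out);
    __m128 va = _mm_loadu_ps(a);                     /* a0 a1 a2 a3 */
    __m128 r  = _mm_loadh_pi(va, (const __m64 *)hi); /* a0 a1 hi[0] hi[1] */
    _mm_storeu_ps(out, r);
}
```

With `a` holding 1, 2, 3, 4 and `hi` holding 9, 10, the result would be expected to be 1, 2, 9, 10.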

_mm_loadl_pi

__m128 _mm_loadl_pi(__m128 a, __m64 const *p);
Sets the lower two SP FP values with 64 bits of data loaded from the address p; the upper two values are passed through from a.

| R0 | R1 | R2 | R3 |
|---|---|---|---|
| *p0 | *p1 | a2 | a3 |
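The low-half counterpart can be sketched the same way. The helper name `replace_low_pair` is hypothetical, and the sketch again assumes the 64 bits at `p` are two adjacent floats.

```c
#include <assert.h>
#include <immintrin.h>

/* Hypothetical helper: replace lanes R0, R1 of a with the two floats
   stored at lo and keep R2, R3. Result lanes go to out[0..3]. */
static void replace_low_pair(const float a[4], const float lo[2], float out[4]) {
    assert(a && lo && out);
    __m128 va = _mm_loadu_ps(a);                     /* a0 a1 a2 a3 */
    __m128 r  = _mm_loadl_pi(va, (const __m64 *)lo); /* lo[0] lo[1] a2 a3 */
    _mm_storeu_ps(out, r);
}
```

With `a` holding 1, 2, 3, 4 and `lo` holding 9, 10, the result would be expected to be 9, 10, 3, 4.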

_mm_load_ss

__m128 _mm_load_ss(float * p);
Loads a SP FP value into the low word and clears the upper three words.
| R0 | R1 | R2 | R3 |
|---|---|---|---|
| *p | 0.0 | 0.0 | 0.0 |
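A minimal sketch of the scalar load follows. The helper name `load_scalar_lanes` is hypothetical; it only shows that the three upper lanes are zeroed regardless of what follows `*p` in memory.

```c
#include <assert.h>
#include <immintrin.h>

/* Hypothetical helper: load one float into lane R0, zero R1-R3,
   and write the four lanes to out[0..3]. */
static void load_scalar_lanes(const float *p, float out[4]) {
    assert(p && out);
    __m128 r = _mm_load_ss(p);   /* *p 0.0 0.0 0.0 */
    _mm_storeu_ps(out, r);
}
```

Only the single float at `p` is read, so `p` may point at a lone scalar rather than an array of four.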

_mm_load1_ps

__m128 _mm_load1_ps(float * p);
Loads a SP FP value, copying it into all four words.