Developer Guide and Reference

C++ Classes and SIMD Operations

Use of C++ classes for SIMD operations allows for operating on arrays or vectors of data in a single operation. Consider the addition of two vectors, A and B, where each vector contains four elements. With a conventional loop, the elements A[i] and B[i] are summed one pair per iteration, as shown in the following example.

Typical Method of Adding Elements Using a Loop

int a[4], b[4], c[4];
for (int i = 0; i < 4; i++)   /* needs four iterations */
    c[i] = a[i] + b[i];       /* computes c[0], c[1], c[2], c[3] */

The following example shows the same results using one operation with an integer class.

SIMD Method of Adding Elements Using Ivec Classes

Is16vec4 ivecA, ivecB, ivecC;   /* needs one iteration */
ivecC = ivecA + ivecB;          /* computes ivecC0, ivecC1, ivecC2, ivecC3 */
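
To make the single-operation claim concrete, the following sketch shows roughly what a packed 16-bit add looks like at the intrinsic level. It is an illustration under assumptions, not the implementation found in ivec.h or dvec.h: add8_i16 is a hypothetical helper, it uses the eight-lane SSE2 intrinsic _mm_add_epi16 rather than the four-lane MMX form behind Is16vec4, and it falls back to a plain loop when SSE2 is not available on the target.

```cpp
#include <cstdint>
#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>   // SSE2 intrinsics
#define SKETCH_HAVE_SSE2 1
#endif

// Hypothetical helper (not part of the Intel headers): adds eight 16-bit
// lanes of a and b and writes the sums to c.
void add8_i16(const std::int16_t* a, const std::int16_t* b, std::int16_t* c)
{
#ifdef SKETCH_HAVE_SSE2
    // Two unaligned loads, one packed add covering all eight lanes, one store.
    __m128i va = _mm_loadu_si128(reinterpret_cast<const __m128i*>(a));
    __m128i vb = _mm_loadu_si128(reinterpret_cast<const __m128i*>(b));
    __m128i vc = _mm_add_epi16(va, vb);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(c), vc);
#else
    // Scalar fallback: eight iterations, one element each.
    for (int i = 0; i < 8; ++i)
        c[i] = static_cast<std::int16_t>(a[i] + b[i]);
#endif
}
```

Calling add8_i16 on two arrays of eight short values fills the output array in one call. With the vector classes, the same packed add would be written as an ordinary + expression.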

Available Classes

The Intel® C++ SIMD classes provide parallelism, which is not easily implemented using typical mechanisms of C++. The following table lists the available SIMD vector classes by instruction set, with the signedness, element data type, element size, number of elements, and header file for each class.

SIMD Vector Classes

Instruction Set               Class      Signedness   Data Type  Size (bits)  Elements  Header File
----------------------------  ---------  -----------  ---------  -----------  --------  -----------
MMX™ technology               I64vec1    unspecified  __m64      64           1         ivec.h
                              I32vec2    unspecified  int        32           2         ivec.h
                              Is32vec2   signed       int        32           2         ivec.h
                              Iu32vec2   unsigned     int        32           2         ivec.h
                              I16vec4    unspecified  short      16           4         ivec.h
                              Is16vec4   signed       short      16           4         ivec.h
                              Iu16vec4   unsigned     short      16           4         ivec.h
                              I8vec8     unspecified  char       8            8         ivec.h
                              Is8vec8    signed       char       8            8         ivec.h
                              Iu8vec8    unsigned     char       8            8         ivec.h
Intel® SSE                    F32vec4    unspecified  float      32           4         fvec.h
                              F32vec1    unspecified  float      32           1         fvec.h
Intel® SSE2                   F64vec2    unspecified  double     64           2         dvec.h
                              I128vec1   unspecified  __m128i    128          1         dvec.h
                              I64vec2    unspecified  long int   64           2         dvec.h
                              I32vec4    unspecified  int        32           4         dvec.h
                              Is32vec4   signed       int        32           4         dvec.h
                              Iu32vec4   unsigned     int        32           4         dvec.h
                              I16vec8    unspecified  short      16           8         dvec.h
                              Is16vec8   signed       short      16           8         dvec.h
                              Iu16vec8   unsigned     short      16           8         dvec.h
                              I8vec16    unspecified  char       8            16        dvec.h
                              Is8vec16   signed       char       8            16        dvec.h
                              Iu8vec16   unsigned     char       8            16        dvec.h
Intel® AVX                    F32vec8    unspecified  float      32           8         dvec.h
                              F64vec4    unspecified  double     64           4         dvec.h
Intel® AVX-512 Foundation     F32vec16   unspecified  float      32           16        dvec.h
                              F64vec8    unspecified  double     64           8         dvec.h
                              M512vec    unspecified  __m512i    512          1         dvec.h
                              I32vec16   unspecified  int        32           16        dvec.h
                              Is32vec16  signed       int        32           16        dvec.h
                              Iu32vec16  unsigned     int        32           16        dvec.h
                              I64vec8    unspecified  long int   64           8         dvec.h
                              Is64vec8   signed       long int   64           8         dvec.h
                              Iu64vec8   unsigned     long int   64           8         dvec.h
Intel® AVX-512 Byte and Word  I16vec32   unspecified  short      16           32        dvec.h
                              Is16vec32  signed       short      16           32        dvec.h
                              Iu16vec32  unsigned     short      16           32        dvec.h
                              I8vec64    unspecified  char       8            64        dvec.h
                              Is8vec64   signed       char       8            64        dvec.h
                              Iu8vec64   unsigned     char       8            64        dvec.h

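
The size and element counts in the table are related: for nearly every class, the element size multiplied by the number of elements gives the register width of the instruction set (64 bits for MMX, 128 for SSE and SSE2, 256 for AVX, 512 for AVX-512). F32vec1 is the exception, since it holds a single float. The sketch below spot-checks that relationship at compile time; lane_count is a hypothetical helper introduced only for illustration and is not part of ivec.h, fvec.h, or dvec.h.

```cpp
// Hypothetical helper: number of elements that fit in a register of the
// given width when each element has the given size.
constexpr int lane_count(int register_bits, int element_bits)
{
    return register_bits / element_bits;
}

// Spot checks against rows of the SIMD Vector Classes table.
static_assert(lane_count(64, 16) == 4,   "Is16vec4: MMX, 16-bit elements");
static_assert(lane_count(128, 32) == 4,  "F32vec4: SSE, 32-bit elements");
static_assert(lane_count(128, 8) == 16,  "I8vec16: SSE2, 8-bit elements");
static_assert(lane_count(256, 64) == 4,  "F64vec4: AVX, 64-bit elements");
static_assert(lane_count(512, 16) == 32, "I16vec32: AVX-512, 16-bit elements");
```

The class names follow the same pattern: the number before "vec" is the element size in bits and the number after it is the element count.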
Most classes contain similar functionality for all data types and are represented by all available intrinsics. However, some capabilities do not translate from one data type to another without suffering from poor performance, and are therefore excluded from individual classes.
Intrinsics that take immediate values and cannot be expressed easily in classes are not implemented. For example:
  • _mm_shuffle_ps
  • _mm_shuffle_pi16
  • _mm_extract_pi16
  • _mm_insert_pi16
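
Because these intrinsics are not wrapped by the classes, code that needs them calls the intrinsic directly. The sketch below reverses four floats with _mm_shuffle_ps. The selector is built with the _MM_SHUFFLE macro and has to be a compile-time constant, which is the property that makes such operations awkward to express as an overloaded operator. reverse4 is a hypothetical helper, and the scalar branch is there only so the sketch also builds on targets without SSE.

```cpp
#if defined(__SSE__) || defined(_M_X64)
#include <xmmintrin.h>   // SSE intrinsics and the _MM_SHUFFLE macro
#define SKETCH_HAVE_SSE 1
#endif

// Hypothetical helper: writes in[3], in[2], in[1], in[0] to out[0..3].
void reverse4(const float* in, float* out)
{
#ifdef SKETCH_HAVE_SSE
    __m128 v = _mm_loadu_ps(in);
    // _MM_SHUFFLE(z, y, x, w) selects the sources for result lanes 3, 2, 1, 0.
    // Result lanes 0 and 1 are taken from the first operand, lanes 2 and 3
    // from the second. Passing v twice with (0, 1, 2, 3) gives:
    //   lane 0 <- v[3], lane 1 <- v[2], lane 2 <- v[1], lane 3 <- v[0]
    __m128 r = _mm_shuffle_ps(v, v, _MM_SHUFFLE(0, 1, 2, 3));
    _mm_storeu_ps(out, r);
#else
    // Scalar fallback with the same result.
    for (int i = 0; i < 4; ++i)
        out[i] = in[3 - i];
#endif
}
```

The immediate cannot be replaced with a runtime variable, so a different element order requires a different call with its own constant selector.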

Access to Classes Using Header Files

The required class header files are installed in the include directory with the Intel® oneAPI DPC++/C++ Compiler. To enable the classes, use the #include directive in your program file as shown in the table that follows.

Include Directives for Enabling Classes

Instruction Set Extension  Include Directive
-------------------------  -----------------
MMX™ Technology            #include <ivec.h>
Intel® SSE                 #include <fvec.h>
Intel® SSE2                #include <dvec.h>
Intel® SSE3                #include <dvec.h>
Intel® SSE4                #include <dvec.h>
Intel® AVX                 #include <dvec.h>

Each successive header file, reading the table from the top down, includes the one before it. To use both the Ivec and Fvec classes, you only need to include fvec.h. Similarly, to use all the classes, including those for Intel® Streaming SIMD Extensions 2 (Intel® SSE2), you only need to include the dvec.h file.

Usage Precautions

When using the C++ classes, follow the general guidelines in this section. More detailed usage rules for each class are listed in Integer Vector Classes and Floating-point Vector Classes.

Clear MMX Registers

If you use both the Ivec and Fvec classes at the same time, your program could mix Intel® MMX™ instructions, called by Ivec classes, with Intel® architecture floating-point instructions, called by Fvec classes. x87 floating-point instructions exist in the following Fvec functions:
  • fvec constructors
  • debug functions (cout and element access)
  • rsqrt_nr
Intel® MMX™ technology registers are aliased on the floating-point registers, so you should clear the MMX state with the EMMS instruction intrinsic before issuing an x87 floating-point instruction, as in the following example.

ivecA = ivecA & ivecB;   /* Ivec logical operation that uses MMX instructions */
empty();                 /* clear state */
cout << f32vec4a;        /* F32vec4 operation that uses x87 floating-point instructions */

Failure to clear the Intel® MMX™ technology registers can result in incorrect execution or poor performance due to an incorrect register state.
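
The same ordering can be sketched with raw intrinsics instead of the classes. In the example below, _mm_empty is the intrinsic for the EMMS instruction; masked_scale is a hypothetical helper, and the choice of long double for the floating-point step is an assumption made because that type is commonly computed on the x87 unit with GCC and Clang on x86 (other compilers may differ). The code takes the MMX path only when the compiler defines __MMX__ and otherwise uses a plain integer AND, in which case there is no MMX state to clear.

```cpp
#if defined(__MMX__)
#include <mmintrin.h>    // MMX intrinsics, including _mm_empty
#define SKETCH_HAVE_MMX 1
#endif

// Hypothetical helper: ANDs two 32-bit values with an MMX instruction,
// clears the MMX state, then does floating-point work that may use the
// x87 registers.
long double masked_scale(int a, int b, long double scale)
{
#ifdef SKETCH_HAVE_MMX
    __m64 va = _mm_cvtsi32_si64(a);                       // move into MMX registers
    __m64 vb = _mm_cvtsi32_si64(b);
    int masked = _mm_cvtsi64_si32(_mm_and_si64(va, vb));  // MMX logical AND
    _mm_empty();   // EMMS: release the registers aliased on the x87 stack
#else
    int masked = a & b;   // no MMX on this target, so nothing to clear
#endif
    return masked * scale;   // floating-point work happens after the clear
}
```

The important detail is the position of the _mm_empty call: after the last MMX operation and before the first floating-point operation.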
