Intel® oneAPI DPC++ Compiler Developer Guide and Reference (Beta)
Introduction, Conventions, and Further Information
Feature Requirements
Getting Help and Support
Related Information
Notational Conventions
Compiler Setup
Using the Command Line
Specifying the Location of Compiler Components with setvars
Invoking the Intel® Compiler
Using the Command Line on Windows*
Understanding File Extensions
Using Makefiles to Compile Your Application
Using Compiler Options
Specifying Include Files
Specifying Object Files
Specifying Assembly Files
Using Eclipse*
Adding the Compiler to Eclipse*
Multi-Version Compiler Support
Creating a Simple Project
Creating a New Project
Setting Options for a Project or File
Building a Project
Running a Project
Using Libraries with Eclipse*
Using Microsoft Visual Studio*
Creating a New Project
Building a Project
Changing the Selected Compiler
Selecting a Configuration
Specifying Directory Paths
Using Property Pages
Using Libraries with Microsoft Visual Studio*
Changing the Selected Libraries
Including MPI Support
Dialog Box Help
Options: Compilers Dialog Box
Options: Libraries Dialog Box
Compiler Reference
C/C++ Calling Conventions
Compiler Options
Alphabetical List of Compiler Options
Ways to Display Certain Option Information
Displaying General Option Information From the Command Line
Compiler Option Details
General Rules for Compiler Options
What Appears in the Compiler Option Descriptions
Optimization Options
fbuiltin, Oi
ffunction-sections
foptimize-sibling-calls
GF
O
Od
Ofast
Os
Ot
Ox
Code Generation Options
EH
fasynchronous-unwind-tables
fexceptions
fomit-frame-pointer
Gd
GR
guard
Gv
m64
m80387
march
masm
mintrinsic-promote, Qintrinsic-promote
momit-leaf-frame-pointer
Qcxx-features
Qpatchable-addresses
regcall, Qregcall
Interprocedural Optimization (IPO) Options
ipo, Qipo
Advanced Optimization Options
daal, Qdaal
ffreestanding
fjump-tables
ipp, Qipp
ipp-link, Qipp-link
mkl, Qmkl
tbb, Qtbb
unroll
Offload Compilation Options
device-math-lib
fintelfpga
foffload-static-lib
fsycl
fsycl-add-targets
fsycl-device-only
fsycl-link
fsycl-link-targets
fsycl-unnamed-lambda
fsycl-targets
fsycl-use-bitcode
Xs
Xsycl-target
Inlining Options
fgnu89-inline
finline
finline-functions
Output, Debug, and Precompiled Header (PCH) Options
c
Fa
FA
fasm-blocks
FC
Fd
FD
Fe
Fo
Fp
ftrapuv
fverbose-asm
g
gdwarf
Gm
grecord-gcc-switches
gsplit-dwarf
o
print-multi-lib
RTC
S
use-msasm
Y-
Yc
Yu
Zi, Z7, ZI
Preprocessor Options
B
C
D
dD, QdD
dM, QdM
E
EP
FI
H, QH
I
I-
idirafter
imacros
iprefix
iquote
isystem
iwithprefix
iwithprefixbefore
Kc++, TP
M, QM
MD, QMD
MF, QMF
MG, QMG
MM, QMM
MMD, QMMD
MP (Linux* OS)
MQ
MT, QMT
nostdinc++
P
U
undef
X
Language Options
ansi
fno-gnu-keywords
fno-operator-names
fno-rtti
fpermissive
fshort-enums
fsyntax-only
funsigned-char
J
std
vd
vmg
x (type option)
Zc
Zp
Zs
Data Options
fcommon
fkeep-static-consts
fmath-errno
fpack-struct
fpic
fpie
fstack-protector
fzero-initialized-in-bss, Qzero-initialized-in-bss
GA
Gs
GS
mcmodel
Compiler Diagnostic Options
w
w0...w5, W0...W5
Wabi
Wall
Wcomment
Wdeprecated
Weffc++, Qeffc++
Werror, WX
Werror-all
Wextra-tokens
Wformat
Wformat-security
Wmain
Wmissing-declarations
Wmissing-prototypes
Wpointer-arith
Wreorder
Wreturn-type
Wshadow
Wsign-compare
Wstrict-aliasing
Wstrict-prototypes
Wtrigraphs
Wuninitialized
Wunknown-pragmas
Wunused-function
Wunused-variable
Wwrite-strings
Compatibility Options
vmv
Linking or Linker Options
fuse-ld
l
L
LD
link
MD
MT
nodefaultlibs
nostartfiles
nostdlib
pie
pthread
shared
shared-libgcc
static
static-libgcc
static-libstdc++
T
u (Linux* OS)
v
Wa
Wl
Wp
Xlinker
Zl
Miscellaneous Options
dumpmachine
dumpversion
Gy
help
nologo
save-temps
showIncludes
sysroot
Tc
TC
Tp
version
Related Options
Portability Options
GCC-Compatible Warning Options
Floating-point Operations
Understanding Floating-Point Operations
Denormal Numbers
Setting the FTZ and DAZ Flags
Tuning Performance
Overview: Tuning Performance
Handling Floating-point Array Operations in a Loop Body
Reducing the Impact of Denormal Exceptions
Avoiding Mixed Data Type Arithmetic Expressions
Using Efficient Data Types
Understanding IEEE Floating-Point Operations
Special Values
Attributes
align
align_value
concurrency_safe
const
cpu_dispatch, cpu_specific
mpx
Intrinsics
Details about Intrinsics
Naming and Usage Syntax
References
Intrinsics for All Intel Architectures
Overview: Intrinsics across Intel® Architectures
Integer Arithmetic Intrinsics
Floating-point Intrinsics
String and Block Copy Intrinsics
Miscellaneous Intrinsics
_may_i_use_cpu_feature
Data Alignment, Memory Allocation Intrinsics, and Inline Assembly
Overview
Alignment Support
Allocating and Freeing Aligned Memory Blocks
Inline Assembly
Intrinsics for Managing Extended Processor States and Registers
Overview
Intrinsics for Reading and Writing the Content of Extended Control Registers
_xgetbv()
_xsetbv()
Intrinsics for Saving and Restoring the Extended Processor States
_fxsave()
_fxsave64()
_fxrstor()
_fxrstor64()
_xsave()
_xsave64()
_xsaveopt()
_xsaveopt64()
_xrstor()
_xrstor64()
Intrinsics for the Short Vector Random Number Generator Library
Data Types and Calling Conventions
Usage Model
Engine Initialization and Finalization
svrng_new_rand0_engine/svrng_new_rand0_ex
svrng_new_rand_engine/svrng_new_rand_ex
svrng_new_mcg31m1_engine/svrng_new_mcg31m1_ex
svrng_new_mcg59_engine/svrng_new_mcg59_ex
svrng_new_mt19937_engine/svrng_new_mt19937_ex
svrng_delete_engine
Distribution Initialization and Finalization
svrng_new_uniform_distribution_[int|float|double]/svrng_update_uniform_distribution_[int|float|double]
svrng_new_normal_distribution_[float|double]/svrng_update_normal_distribution_[float|double]
svrng_delete_distribution
Random Value Generation
svrng_generate[1|2|4|8|16|32]_[uint|ulong]
svrng_generate[1|2|4|8|16|32]_[int|float|double]
Service Routines
Parallel Computation Support
svrng_copy_engine
svrng_skipahead_engine
svrng_leapfrog_engine
Error Handling
svrng_set_status
svrng_get_status
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) BF16 Instructions
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) VPOPCNTDQ Instructions
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) BW, DQ, and VL Instructions
Intrinsics for Arithmetic Operations
Intrinsics for Bit Manipulation Operations
Intrinsics for Comparison Operations
Intrinsics for Conversion Operations
Intrinsics for Load Operations
Intrinsics for Logical Operations
Intrinsics for Miscellaneous Operations
Intrinsics for Move Operations
Intrinsics for Set Operations
Intrinsics for Shift Operations
Intrinsics for Store Operations
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Instructions
Overview: Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) Instructions
Intrinsics for Arithmetic Operations
Intrinsics for Addition Operations
Intrinsics for FP Addition Operations
Intrinsics for Integer Addition Operations
Intrinsics for Determining Minimum and Maximum Values
Intrinsics for Determining Minimum and Maximum FP Values
Intrinsics for Determining Minimum and Maximum Integer Values
Intrinsics for FP Fused Multiply-Add (FMA) Operations
Intrinsics for Multiplication Operations
Intrinsics for FP Multiplication Operations
Intrinsics for Integer Multiplication Operations
Intrinsics for Subtraction Operations
Intrinsics for FP Subtraction Operations
Intrinsics for Integer Subtraction Operations
Intrinsics for Short Vector Math Library (SVML) Operations
Intrinsics for Division Operations (512-bit)
Intrinsics for Error Function Operations (512-bit)
Intrinsics for Exponential Operations (512-bit)
Intrinsics for Logarithmic Operations (512-bit)
Intrinsics for Reciprocal Operations (512-bit)
Intrinsics for Root Function Operations (512-bit)
Intrinsics for Rounding Operations (512-bit)
Intrinsics for Trigonometric Operations (512-bit)
Intrinsics for Other Mathematics Operations
Intrinsics for FP Division Operations
Intrinsics for Absolute Value Operations
Intrinsics for Scale Operations
Intrinsics for Blend Operations
Intrinsics for Integer Bit Manipulation Operations
Intrinsics for Bit Manipulation and Conflict Detection Operations
Intrinsics for Bitwise Logical Operations
Intrinsics for Integer Bit Rotation Operations
Intrinsics for Integer Bit Shift Operations
Intrinsics for Broadcast Operations
Intrinsics for FP Broadcast Operations
Intrinsics for Integer Broadcast Operations
Intrinsics for Comparison Operations
Intrinsics for FP Comparison Operations
Intrinsics for Integer Comparison Operations
Intrinsics for Compression Operations
Intrinsics for Conversion Operations
Intrinsics for FP Conversion Operations
Intrinsics for Integer Conversion Operations
Intrinsics for Expand and Load Operations
Intrinsics for FP Expand and Load Operations
Intrinsics for Integer Expand and Load Operations
Intrinsics for Gather and Scatter Operations
Intrinsics for FP Gather and Scatter Operations
Intrinsics for Integer Gather and Scatter Operations
Intrinsics for Insert and Extract Operations
Intrinsics for FP Insert and Extract Operations
Intrinsics for Integer Insert and Extract Operations
Intrinsics for Load and Store Operations
Intrinsics for FP Load and Store Operations
Intrinsics for Integer Load and Store Operations
Intrinsics for Miscellaneous Operations
Intrinsics for Miscellaneous FP Operations
Intrinsics for Miscellaneous Integer Operations
Intrinsics for Move Operations
Intrinsics for FP Move Operations
Intrinsics for Integer Move Operations
Intrinsics for Pack and Unpack Operations
Intrinsics for FP Pack and Store Operations
Intrinsics for Integer Pack and Unpack Operations
Intrinsics for Permutation Operations
Intrinsics for FP Permutation Operations
Intrinsics for Integer Permutation Operations
Intrinsics for Reduction Operations
Intrinsics for FP Reduction Operations
Intrinsics for Integer Reduction Operations
Intrinsics for Set Operations
Intrinsics for Shuffle Operations
Intrinsics for FP Shuffle Operations
Intrinsics for Integer Shuffle Operations
Intrinsics for Test Operations
Intrinsics for Typecast Operations
Intrinsics for Vector Mask Operations
Intrinsics for Later Generation Intel® Core™ Processor Instruction Extensions
Overview: Intrinsics for 3rd Generation Intel® Core™ Processor Instruction Extensions
Overview: Intrinsics for 4th Generation Intel® Core™ Processor Instruction Extensions
Intrinsics for Converting Half Floats that Map to 3rd Generation Intel® Core™ Processor Instructions
_mm_cvtph_ps()
_mm256_cvtph_ps()
_mm_cvtps_ph()
_mm256_cvtps_ph()
Intrinsics that Generate 16/32/64-Bit Wide Random Integers
_rdrand_u16(), _rdrand_u32(), _rdrand_u64()
_rdseed_u16(), _rdseed_u32(), _rdseed_u64()
Intrinsics for Multi-Precision Arithmetic
_addcarry_u32(), _addcarry_u64()
_addcarryx_u32(), _addcarryx_u64()
_subborrow_u32(), _subborrow_u64()
Intrinsics that Allow Reading from and Writing to the FS Base and GS Base Registers
_readfsbase_u32(), _readfsbase_u64()
_readgsbase_u32(), _readgsbase_u64()
_writefsbase_u32(), _writefsbase_u64()
_writegsbase_u32(), _writegsbase_u64()
Intrinsics for Intel® Advanced Vector Extensions 2
Overview: Intrinsics for Intel® Advanced Vector Extensions 2 Instructions
Intrinsics for Arithmetic Operations
_mm256_abs_epi8/16/32
_mm256_add_epi8/16/32/64
_mm256_adds_epi8/16
_mm256_adds_epu8/16
_mm256_sub_epi8/16/32/64
_mm256_subs_epi8/16
_mm256_subs_epu8/16
_mm256_avg_epu8/16
_mm256_hadd_epi16/32
_mm256_hadds_epi16
_mm256_hsub_epi16/32
_mm256_hsubs_epi16
_mm256_madd_epi16
_mm256_maddubs_epi16
_mm256_mul_epi32
_mm256_mul_epu32
_mm256_mulhi_epi16
_mm256_mulhi_epu16
_mm256_mullo_epi16/32
_mm256_mulhrs_epi16
_mm256_sign_epi8/16/32
_mm256_mpsadbw_epu8
_mm256_sad_epu8
Intrinsics for Arithmetic Shift Operations
_mm256_sra_epi16/32
_mm256_srai_epi16/32
_mm256_srav_epi32
_mm_srav_epi32
Intrinsics for Blend Operations
_mm_blend_epi32/ _mm256_blend_epi16/32
_mm256_blendv_epi8
Intrinsics for Bitwise Operations
_mm256_and_si256
_mm256_andnot_si256
_mm256_or_si256
_mm256_xor_si256
Intrinsics for Broadcast Operations
_mm_broadcastss_ps/ _mm256_broadcastss_ps
_mm_broadcastsd_pd/ _mm256_broadcastsd_pd
_mm_broadcastb_epi8/ _mm256_broadcastb_epi8
_mm_broadcastw_epi16/ _mm256_broadcastw_epi16
_mm_broadcastd_epi32/ _mm256_broadcastd_epi32
_mm_broadcastq_epi64/ _mm256_broadcastq_epi64
_mm256_broadcastsi128_si256
Intrinsics for Compare Operations
_mm256_cmpeq_epi8/16/32/64
_mm256_cmpgt_epi8/16/32/64
_mm256_max_epi8/16/32
_mm256_max_epu8/16/32
_mm256_min_epi8/16/32
_mm256_min_epu8/16/32
Intrinsics for Fused Multiply Add Operations
_mm_fmadd_pd/ _mm256_fmadd_pd
_mm_fmadd_ps/ _mm256_fmadd_ps
_mm_fmadd_sd
_mm_fmadd_ss
_mm_fmaddsub_pd/ _mm256_fmaddsub_pd
_mm_fmaddsub_ps/ _mm256_fmaddsub_ps
_mm_fmsub_pd/ _mm256_fmsub_pd
_mm_fmsub_ps/ _mm256_fmsub_ps
_mm_fmsub_sd
_mm_fmsub_ss
_mm_fmsubadd_pd/ _mm256_fmsubadd_pd
_mm_fmsubadd_ps/ _mm256_fmsubadd_ps
_mm_fnmadd_pd/ _mm256_fnmadd_pd
_mm_fnmadd_ps/ _mm256_fnmadd_ps
_mm_fnmadd_sd
_mm_fnmadd_ss
_mm_fnmsub_pd/ _mm256_fnmsub_pd
_mm_fnmsub_ps/ _mm256_fnmsub_ps
_mm_fnmsub_sd
_mm_fnmsub_ss
Intrinsics for GATHER Operations
_mm_mask_i32gather_pd/ _mm256_mask_i32gather_pd
_mm_i32gather_pd/ _mm256_i32gather_pd
_mm_mask_i64gather_pd/ _mm256_mask_i64gather_pd
_mm_i64gather_pd/ _mm256_i64gather_pd
_mm_mask_i32gather_ps/ _mm256_mask_i32gather_ps
_mm_i32gather_ps/ _mm256_i32gather_ps
_mm_mask_i64gather_ps/ _mm256_mask_i64gather_ps
_mm_i64gather_ps/ _mm256_i64gather_ps
_mm_mask_i32gather_epi32/ _mm256_mask_i32gather_epi32
_mm_i32gather_epi32/ _mm256_i32gather_epi32
_mm_mask_i32gather_epi64/ _mm256_mask_i32gather_epi64
_mm_i32gather_epi64/ _mm256_i32gather_epi64
_mm_mask_i64gather_epi32/ _mm256_mask_i64gather_epi32
_mm_i64gather_epi32/ _mm256_i64gather_epi32
_mm_mask_i64gather_epi64/ _mm256_mask_i64gather_epi64
_mm_i64gather_epi64/ _mm256_i64gather_epi64
Intrinsics for Logical Shift Operations
_mm256_sll_epi16/32/64
_mm256_slli_epi16/32/64
_mm256_sllv_epi32/64
_mm_sllv_epi32/64
_mm256_slli_si256
_mm256_srli_si256
_mm256_srl_epi16/32/64
_mm256_srli_epi16/32/64
_mm256_srlv_epi32/64
_mm_srlv_epi32/64
Intrinsics for Insert/Extract Operations
_mm256_inserti128_si256
_mm256_extracti128_si256
_mm256_insert_epi8/16/32/64
_mm256_extract_epi8/16/32/64
Intrinsics for Masked Load/Store Operations
_mm_maskload_epi32/64/ _mm256_maskload_epi32/64
_mm_maskstore_epi32/64/ _mm256_maskstore_epi32/64
Intrinsics for Miscellaneous Operations
_mm256_alignr_epi8
_mm256_movemask_epi8
_mm256_stream_load_si256
Intrinsics for Operations to Manipulate Integer Data at Bit-Granularity
_bextr_u32/64
_blsi_u32/64
_blsmsk_u32/64
_blsr_u32/64
_bzhi_u32/64
_pext_u32/64
_pdep_u32/64
_lzcnt_u32/64
_tzcnt_u32/64
Intrinsics for Pack/Unpack Operations
_mm256_packs_epi16/32
_mm256_packus_epi16/32
_mm256_unpackhi_epi8/16/32/64
_mm256_unpacklo_epi8/16/32/64
Intrinsics for Packed Move with Extend Operations
_mm256_cvtepi8_epi16/32/64
_mm256_cvtepi16_epi32/64
_mm256_cvtepi32_epi64
_mm256_cvtepu8_epi16/32/64
_mm256_cvtepu16_epi32/64
_mm256_cvtepu32_epi64
Intrinsics for Permute Operations
_mm256_permutevar8x32_epi32
_mm256_permutevar8x32_ps
_mm256_permute4x64_epi64
_mm256_permute4x64_pd
_mm256_permute2x128_si256
Intrinsics for Shuffle Operations
_mm256_shuffle_epi8
_mm256_shuffle_epi32
_mm256_shufflehi_epi16
_mm256_shufflelo_epi16
Intrinsics for Intel® Transactional Synchronization Extensions (Intel® TSX)
Overview
Programming Considerations
Restricted Transactional Memory Intrinsics
RTM Overview
_xtest
_xbegin
_xend
_xabort
Hardware Lock Elision Intrinsics (Windows*)
HLE Overview (Windows*)
Acquire _InterlockedCompareExchange Functions (Windows*)
Acquire _InterlockedExchangeAdd Functions (Windows*)
Release _InterlockedCompareExchange Functions (Windows*)
Release _InterlockedExchangeAdd Functions (Windows*)
Release _Store Functions (Windows*)
Function Prototypes and Macro Definitions (Windows*)
Intrinsics for Intel® Advanced Vector Extensions
Overview
Details of Intel® AVX Intrinsics and FMA Intrinsics
Intrinsics for Arithmetic Operations
_mm256_add_pd
_mm256_add_ps
_mm256_addsub_pd
_mm256_addsub_ps
_mm256_hadd_pd
_mm256_hadd_ps
_mm256_sub_pd
_mm256_sub_ps
_mm256_hsub_pd
_mm256_hsub_ps
_mm256_mul_pd
_mm256_mul_ps
_mm256_div_pd
_mm256_div_ps
_mm256_dp_ps
_mm256_sqrt_pd
_mm256_sqrt_ps
_mm256_rsqrt_ps
_mm256_rcp_ps
Intrinsics for Bitwise Operations
_mm256_and_pd
_mm256_and_ps
_mm256_andnot_pd
_mm256_andnot_ps
_mm256_or_pd
_mm256_or_ps
_mm256_xor_pd
_mm256_xor_ps
Intrinsics for Blend and Conditional Merge Operations
_mm256_blend_pd
_mm256_blend_ps
_mm256_blendv_pd
_mm256_blendv_ps
Intrinsics for Compare Operations
_mm_cmp_pd, _mm256_cmp_pd
_mm_cmp_ps, _mm256_cmp_ps
_mm_cmp_sd
_mm_cmp_ss
Intrinsics for Conversion Operations
_mm256_cvtepi32_pd
_mm256_cvtepi32_ps
_mm256_cvtpd_epi32
_mm256_cvtps_epi32
_mm256_cvtpd_ps
_mm256_cvtps_pd
_mm256_cvttpd_epi32
_mm256_cvttps_epi32
_mm256_cvtsi256_si32
_mm256_cvtsd_f64
_mm256_cvtss_f32
Intrinsics to Determine Maximum and Minimum Values
_mm256_max_pd
_mm256_max_ps
_mm256_min_pd
_mm256_min_ps
Intrinsics for Load and Store Operations
_mm256_broadcast_pd
_mm256_broadcast_ps
_mm256_broadcast_sd
_mm256_broadcast_ss, _mm_broadcast_ss
_mm256_load_pd
_mm256_load_ps
_mm256_load_si256
_mm256_loadu_pd
_mm256_loadu_ps
_mm256_loadu_si256
_mm256_maskload_pd, _mm_maskload_pd
_mm256_maskload_ps, _mm_maskload_ps
_mm256_store_pd
_mm256_store_ps
_mm256_store_si256
_mm256_storeu_pd
_mm256_storeu_ps
_mm256_storeu_si256
_mm256_stream_pd
_mm256_stream_ps
_mm256_stream_si256
_mm256_maskstore_pd, _mm_maskstore_pd
_mm256_maskstore_ps, _mm_maskstore_ps
Intrinsics for Miscellaneous Operations
_mm256_extractf128_pd
_mm256_extractf128_ps
_mm256_extractf128_si256
_mm256_insertf128_pd
_mm256_insertf128_ps
_mm256_insertf128_si256
_mm256_lddqu_si256
_mm256_movedup_pd
_mm256_movehdup_ps
_mm256_moveldup_ps
_mm256_movemask_pd
_mm256_movemask_ps
_mm256_round_pd
_mm256_round_ps
_mm256_set_pd
_mm256_set_ps
_mm256_set_epi32
_mm256_setr_pd
_mm256_setr_ps
_mm256_setr_epi32
_mm256_set1_pd
_mm256_set1_ps
_mm256_set1_epi32
_mm256_setzero_pd
_mm256_setzero_ps
_mm256_setzero_si256
_mm256_zeroall
_mm256_zeroupper
Intrinsics for Packed Test Operations
_mm256_testz_si256
_mm256_testc_si256
_mm256_testnzc_si256
_mm256_testz_pd, _mm_testz_pd
_mm256_testz_ps, _mm_testz_ps
_mm256_testc_pd, _mm_testc_pd
_mm256_testc_ps, _mm_testc_ps
_mm256_testnzc_pd, _mm_testnzc_pd
_mm256_testnzc_ps, _mm_testnzc_ps
Intrinsics for Permute Operations
_mm256_permute_pd, _mm_permute_pd
_mm256_permute_ps
_mm256_permutevar_pd, _mm_permutevar_pd
_mm256_permutevar_ps
_mm256_permute2f128_pd
_mm256_permute2f128_ps
_mm256_permute2f128_si256
Intrinsics for Shuffle Operations
_mm256_shuffle_pd
_mm256_shuffle_ps
Intrinsics for Unpack and Interleave Operations
_mm256_unpackhi_pd
_mm256_unpackhi_ps
_mm256_unpacklo_pd
_mm256_unpacklo_ps
Support Intrinsics for Vector Typecasting Operations
_mm256_castpd_ps
_mm256_castps_pd
_mm256_castpd_si256
_mm256_castps_si256
_mm256_castsi256_pd
_mm256_castsi256_ps
_mm256_castpd128_pd256
_mm256_castpd256_pd128
_mm256_castps128_ps256
_mm256_castps256_ps128
_mm256_castsi128_si256
_mm256_castsi256_si128
Intrinsics Generating Vectors of Undefined Values
_mm256_undefined_ps()
_mm256_undefined_pd()
_mm256_undefined_si256()
Intrinsics for Intel® Streaming SIMD Extensions 4
Overview
Efficient Accelerated String and Text Processing
Overview
Packed Compare Intrinsics
Application Targeted Accelerators Intrinsics
Vectorizing Compiler and Media Accelerators
Overview: Vectorizing Compiler and Media Accelerators
Packed Blending Intrinsics
Floating Point Dot Product Intrinsics
Packed Format Conversion Intrinsics
Packed Integer Min/Max Intrinsics
Floating Point Rounding Intrinsics
DWORD Multiply Intrinsics
Register Insertion/Extraction Intrinsics
Test Intrinsics
Packed DWORD to Unsigned WORD Intrinsic
Packed Compare for Equal Intrinsic
Cacheability Support Intrinsic
Intrinsics for Intel® Supplemental Streaming SIMD Extensions 3
Overview
Addition Intrinsics
Subtraction Intrinsics
Multiplication Intrinsics
Absolute Value Intrinsics
Shuffle Intrinsics
Concatenate Intrinsics
Negation Intrinsics
Intrinsics for Intel® Streaming SIMD Extensions 3
Overview
Integer Vector Intrinsics
Single-precision Floating-point Vector Intrinsics
Double-precision Floating-point Vector Intrinsics
Miscellaneous Intrinsics
Intrinsics for Intel® Streaming SIMD Extensions 2
Overview
Macro Functions
Floating-point Intrinsics
Arithmetic Intrinsics
Logical Intrinsics
Compare Intrinsics
Conversion Intrinsics
Load Intrinsics
Set Intrinsics
Store Intrinsics
Integer Intrinsics
Arithmetic Intrinsics
Logical Intrinsics
Shift Intrinsics
Compare Intrinsics
Conversion Intrinsics
Move Intrinsics
Load Intrinsics
Set Intrinsics
Store Intrinsics
Miscellaneous Functions and Intrinsics
Cacheability Support Intrinsics
Miscellaneous Intrinsics
Casting Support Intrinsics
Pause Intrinsic
Macro Function for Shuffle
Intrinsics Returning Vectors of Undefined Values
Intrinsics for Intel® Streaming SIMD Extensions
Overview
Details about Intel® Streaming SIMD Extension Intrinsics
Writing Programs with Intel® Streaming SIMD Extensions Intrinsics
Arithmetic Intrinsics
Logical Intrinsics
Compare Intrinsics
Conversion Intrinsics
Load Intrinsics
Set Intrinsics
Store Intrinsics
Cacheability Support Intrinsics
Integer Intrinsics
Intrinsics to Read and Write Registers
Miscellaneous Intrinsics
Macro Functions
Macro Function for Shuffle Operations
Macro Functions to Read and Write Control Registers
Macro Function for Matrix Transposition
Intrinsics for MMX™ Technology
Overview
Details about MMX™ Technology Intrinsics
The EMMS Instruction: Why You Need It
EMMS Usage Guidelines
General Support Intrinsics
Packed Arithmetic Intrinsics
Shift Intrinsics
Logical Intrinsics
Compare Intrinsics
Set Intrinsics
Intrinsics for Advanced Encryption Standard Implementation
Overview
Intrinsics for Carry-less Multiplication Instruction and Advanced Encryption Standard Instructions
Intrinsics for Converting Half Floats
Overview
Intrinsics for Converting Half Floats
Intrinsics for Short Vector Math Library Operations
Overview
Intrinsics for Division Operations
_mm_div_epi8/ _mm256_div_epi8
_mm_div_epi16/ _mm256_div_epi16
_mm_div_epi32/ _mm256_div_epi32
_mm_div_epi64/ _mm256_div_epi64
_mm_div_epu8/ _mm256_div_epu8
_mm_div_epu16/ _mm256_div_epu16
_mm_div_epu32/ _mm256_div_epu32
_mm_div_epu64/ _mm256_div_epu64
_mm_rem_epi8/ _mm256_rem_epi8
_mm_rem_epi16/ _mm256_rem_epi16
_mm_rem_epi32/ _mm256_rem_epi32
_mm_rem_epi64/ _mm256_rem_epi64
_mm_rem_epu8/ _mm256_rem_epu8
_mm_rem_epu16/ _mm256_rem_epu16
_mm_rem_epu32/ _mm256_rem_epu32
_mm_rem_epu64/ _mm256_rem_epu64
Intrinsics for Error Function Operations
_mm_cdfnorminv_pd, _mm256_cdfnorminv_pd
_mm_cdfnorminv_ps, _mm256_cdfnorminv_ps
_mm_erf_pd, _mm256_erf_pd
_mm_erf_ps, _mm256_erf_ps
_mm_erfc_pd, _mm256_erfc_pd
_mm_erfc_ps, _mm256_erfc_ps
_mm_erfinv_pd, _mm256_erfinv_pd
_mm_erfinv_ps, _mm256_erfinv_ps
Intrinsics for Exponential Operations
_mm_exp2_pd, _mm256_exp2_pd
_mm_exp2_ps, _mm256_exp2_ps
_mm_exp_pd, _mm256_exp_pd
_mm_exp_ps, _mm256_exp_ps
_mm_exp10_pd, _mm256_exp10_pd
_mm_exp10_ps, _mm256_exp10_ps
_mm_expm1_pd, _mm256_expm1_pd
_mm_expm1_ps, _mm256_expm1_ps
_mm_cexp_ps, _mm256_cexp_ps
_mm_pow_pd, _mm256_pow_pd
_mm_pow_ps, _mm256_pow_ps
_mm_hypot_pd, _mm256_hypot_pd
_mm_hypot_ps, _mm256_hypot_ps
Intrinsics for Logarithmic Operations
_mm_log2_pd, _mm256_log2_pd
_mm_log2_ps, _mm256_log2_ps
_mm_log10_pd, _mm256_log10_pd
_mm_log10_ps, _mm256_log10_ps
_mm_log_pd, _mm256_log_pd
_mm_log_ps, _mm256_log_ps
_mm_logb_pd, _mm256_logb_pd
_mm_logb_ps, _mm256_logb_ps
_mm_log1p_pd, _mm256_log1p_pd
_mm_log1p_ps, _mm256_log1p_ps
_mm_clog_ps, _mm256_clog_ps
Intrinsics for Square Root and Cube Root Operations
_mm_sqrt_pd, _mm256_sqrt_pd
_mm_sqrt_ps, _mm256_sqrt_ps
_mm_invsqrt_pd, _mm256_invsqrt_pd
_mm_invsqrt_ps, _mm256_invsqrt_ps
_mm_cbrt_pd, _mm256_cbrt_pd
_mm_cbrt_ps, _mm256_cbrt_ps
_mm_invcbrt_pd, _mm256_invcbrt_pd
_mm_invcbrt_ps, _mm256_invcbrt_ps
_mm_csqrt_ps, _mm256_csqrt_ps
Intrinsics for Trigonometric Operations
_mm_acos_pd, _mm256_acos_pd
_mm_acos_ps, _mm256_acos_ps
_mm_acosh_pd, _mm256_acosh_pd
_mm_acosh_ps, _mm256_acosh_ps
_mm_asin_pd, _mm256_asin_pd
_mm_asin_ps, _mm256_asin_ps
_mm_asinh_pd, _mm256_asinh_pd
_mm_asinh_ps, _mm256_asinh_ps
_mm_atan_pd, _mm256_atan_pd
_mm_atan_ps, _mm256_atan_ps
_mm_atan2_pd, _mm256_atan2_pd
_mm_atan2_ps, _mm256_atan2_ps
_mm_atanh_pd, _mm256_atanh_pd
_mm_atanh_ps, _mm256_atanh_ps
_mm_cos_pd, _mm256_cos_pd
_mm_cos_ps, _mm256_cos_ps
_mm_cosd_pd, _mm256_cosd_pd
_mm_cosd_ps, _mm256_cosd_ps
_mm_cosh_pd, _mm256_cosh_pd
_mm_cosh_ps, _mm256_cosh_ps
_mm_sin_pd, _mm256_sin_pd
_mm_sin_ps, _mm256_sin_ps
_mm_sind_pd, _mm256_sind_pd
_mm_sind_ps, _mm256_sind_ps
_mm_sinh_pd, _mm256_sinh_pd
_mm_sinh_ps, _mm256_sinh_ps
_mm_tan_pd, _mm256_tan_pd
_mm_tan_ps, _mm256_tan_ps
_mm_tand_pd, _mm256_tand_pd
_mm_tand_ps, _mm256_tand_ps
_mm_tanh_pd, _mm256_tanh_pd
_mm_tanh_ps, _mm256_tanh_ps
_mm_sincos_pd, _mm256_sincos_pd
_mm_sincos_ps, _mm256_sincos_ps
Libraries
Creating Libraries
Using Intel Shared Libraries
Managing Libraries
Redistributing Libraries When Deploying Applications
Intel's Memory Allocator Library
SIMD Data Layout Templates (SDLT)
Usage Guidelines: Function Calls and Containers
Constructing an n_container
Bounds
User-Level Interface
SDLT Primitives (SDLT_PRIMITIVE)
soa1d_container
aos1d_container
access_by
n_container
Layouts
Shape
n_extent_generator
make_n_container Template Function
extent_d Template Function
Bounds
bounds_t
bounds Template Function
n_bounds_t
n_bounds_generator
bounds_d Template Function
Accessors
soa1d_container::accessor and aos1d_container::accessor
soa1d_container::const_accessor and aos1d_container::const_accessor
Accessor Concept
Proxy Objects
Proxy
ConstProxy
Number Representation
Indexes
linear_index
n_index_t
n_index_generator
index_d Template Function
Conveniences and Correctness
max_val
min_val
Examples
Example 1
Example 2
Example 3
Example 4
Example 5
Intel® C++ Class Libraries
C++ Classes and SIMD Operations
Capabilities of C++ SIMD Classes
Integer Vector Classes
Terms, Conventions, and Syntax Defined
Rules for Operators
Assignment Operator
Logical Operators
Addition and Subtraction Operators
Multiplication Operators
Shift Operators
Comparison Operators
Conditional Select Operators
Debug Operations
Unpack Operators
Pack Operators
Clear MMX™ State Operator
Integer Functions for Streaming SIMD Extensions
Conversions between Fvec and Ivec
Floating-point Vector Classes
Fvec Notation Conventions
Data Alignment
Conversions
Constructors and Initialization
Arithmetic Operators
Minimum and Maximum Operators
Logical Operators
Compare Operators
Conditional Select Operators for Fvec Classes
Cacheability Support Operators
Debug Operations
Load and Store Operators
Unpack Operators
Move Mask Operators
Classes Quick Reference
Programming Example
C++ Library Extensions
Intel's valarray Implementation
Using Intel's valarray Implementation
Intel® C++ Asynchronous I/O Extensions for Windows*
Intel® C++ Asynchronous I/O Library for Windows*
aio_read
aio_write
Example for aio_read and aio_write Functions
aio_suspend
Example for aio_suspend Function
aio_error
aio_return
Example for aio_error and aio_return Functions
aio_fsync
aio_cancel
Example for aio_cancel Function
lio_listio
Example for lio_listio Function
Handling Errors Caused by Asynchronous I/O Functions
Intel® C++ Asynchronous I/O Class for Windows*
Template Class async_class
get_last_operation_id
wait
get_status
get_last_error
get_error_operation_id
stop_queue
resume_queue
clear_queue
Example for Using async_class Template Class
Intel® IEEE 754-2008 Binary Floating-Point Conformance Library
Overview: IEEE 754-2008 Binary Floating-Point Conformance Library
Using the IEEE 754-2008 Binary Floating-Point Conformance Library
Function List
Homogeneous General-Computational Operations Functions
General-Computational Operations Functions
Quiet-Computational Operations Functions
Signaling-Computational Operations Functions
Non-Computational Operations Functions
Intel's String and Numeric Conversion Library
Overview
Function List
Macros
ISO Standard Predefined Macros
Additional Predefined Macros
Pragmas
Intel-specific Pragma Reference
alloc_section
block_loop/noblock_loop
code_align
distribute_point
inline/noinline/forceinline
intel_omp_task
intel_omp_taskq
ivdep
loop_count
nofusion
novector
omp simd early_exit
optimize
optimization_level
optimization_parameter
prefetch/noprefetch
simd
simdoff
unroll/nounroll
unroll_and_jam/nounroll_and_jam
vector
Intel-supported Pragma Reference
Error Handling
Warnings and Errors
Compilation
Supported Environment Variables
Compilation Phases
Passing Options to the Linker
Using Configuration Files
Using Response Files
Global Symbols and Visibility Attributes
Specifying Symbol Visibility Explicitly
Saving Compiler Information in Your Executable
Linking Debug Information
Ahead of Time Compilation
Compiling Host Programs with a Third-Party Compiler
Optimization and Programming Guide
Extensions
Unified Shared Memory
Concurrent Functions
mem_advise
Explicit Functions
aligned_alloc (1)
aligned_alloc (2)
fill (1)
malloc (1)
malloc (2)
memcpy
memset
General Functions
aligned_alloc (1)
aligned_alloc (2)
free (1)
free (2)
malloc (1)
malloc (2)
Restricted Functions
aligned_alloc_host (1)
aligned_alloc_host (2)
aligned_alloc_shared (1)
aligned_alloc_shared (2)
malloc_host (1)
malloc_host (2)
malloc_shared (1)
malloc_shared (2)
prefetch
Informational Functions
get_pointer_device
get_pointer_type
Sub-groups for NDRange Parallelism
Core Functionality
Common Member Functions
Synchronization Functions
Vote/Ballot
Collectives
Extended Functionality
Shuffles
Two-Input Shuffles
Loads/Stores
C and C++ Standard Libraries Support
Queue Order Properties
SYCL_INTEL_unnamed_kernel_lambda
DPC++ L0 Switch
Automatic Parallelization
Programming with Auto-parallelization
Enabling Further Loop Parallelization for Multicore Platforms
Language Support for Auto-parallelization
Vectorization
Automatic Vectorization
Automatic Vectorization Overview
Programming Guidelines for Vectorization
Using Automatic Vectorization
Vectorization and Loops
Loop Constructs
Explicit Vector Programming
User-Mandated or SIMD Vectorization
SIMD-Enabled Functions
SIMD-Enabled Function Pointers
SIMD Vectorization Using the _Simd Keyword
Function Annotations and the SIMD Directive for Vectorization
High-Level Optimization (HLO)
Interprocedural Optimization (IPO)
Using IPO
IPO-Related Performance Issues
IPO for Large Programs
Understanding Code Layout and Multi-Object IPO
Creating a Library from IPO Objects
Inline Expansion of Functions
Compiler Directed Inline Expansion of Functions
Developer Directed Inline Expansion of User Functions
Methods to Optimize Code Size
Disable or Decrease the Amount of Inlining
Strip Symbols from Your Binaries
Exclude Unused Code and Data from the Executable
Disable Recognition and Expansion of Intrinsic Functions
Optimize Exception Handling Data (Linux)
Disable Loop Unrolling
Avoid References to Compiler-Specific Libraries
Intel® Math Library
Overview: Intel® Math Library
Using Intel® Math Library
Math Functions
Function List
Trigonometric Functions
Hyperbolic Functions
Exponential Functions
Special Functions
Nearest Integer Functions
Remainder Functions
Miscellaneous Functions
Complex Functions
C99 Macros
Compatibility and Portability
Conformance to the C/C++ Standards
gcc* Compatibility and Interoperability
Microsoft* Compatibility
Precompiled Header Support
Compilation and Execution Differences
Enum Bit-Field Signedness
Portability
Porting from the Microsoft* Compiler to the Intel® Compiler
Overview: Porting from the Microsoft* Compiler to the Intel® Compiler
Modifying Your makefile
Other Considerations
Porting from gcc* to the Intel® C++ Compiler
Overview: Porting from gcc* to the Intel® Compiler
Modifying Your makefile
Equivalent Macros
Other Considerations
Notices and Disclaimers