__assume_aligned
__declspec
align
align_value
concurrency_safe
const
cpu_dispatch, cpu_specific
mpx
align
align_value
concurrency_safe
const
cpu_dispatch
cpu_specific
mpx
__regcall
_InterlockedCompareExchange_HLEAcquire
_InterlockedCompareExchange_HLERelease
_InterlockedCompareExchange64_HLEAcquire
_InterlockedCompareExchange64_HLERelease
_InterlockedCompareExchangePointer_HLEAcquire
_InterlockedCompareExchangePointer_HLERelease
_InterlockedExchangeAdd_HLEAcquire
_InterlockedExchangeAdd_HLERelease
_InterlockedExchangeAdd64_HLEAcquire
_InterlockedExchangeAdd64_HLERelease
_may_i_use_cpu_feature
_mm_div_epi16
_mm_div_epi32
_mm_div_epi64
_mm_div_epi8
_mm_div_epu16
_mm_div_epu32
_mm_div_epu64
_mm_div_epu8
_mm_rem_epi16
_mm_rem_epi32
_mm_rem_epi64
_mm_rem_epi8
_mm_rem_epu16
_mm_rem_epu32
_mm_rem_epu64
_mm_rem_epu8
_mm256_div_epi16
_mm256_div_epi32
_mm256_div_epi64
_mm256_div_epi8
_mm256_div_epu16
_mm256_div_epu32
_mm256_div_epu64
_mm256_div_epu8
_mm256_rem_epi16
_mm256_rem_epi32
_mm256_rem_epi64
_mm256_rem_epi8
_mm256_rem_epu16
_mm256_rem_epu32
_mm256_rem_epu64
_mm256_rem_epu8
_rdseed16_step
_rdseed32_step
_rdseed64_step
_Simd keyword
_Store_HLERelease
_Store64_HLERelease
_StorePointer_HLERelease
_xabort
_xbegin
_xend
_xtest
--sysroot compiler option (Linux* only)
--version compiler option
-ansi compiler option
-B compiler option
-c compiler option
c
Creating Libraries
-C compiler option
-D compiler option
-daal compiler option
-dD compiler option
-device-math-lib compiler option
-dM compiler option
-dumpmachine compiler option
-dumpversion compiler option
-E compiler option
-EP compiler option
-Fa compiler option
-fasm-blocks compiler option
-fasynchronous-unwind-tables compiler option
-fbuiltin compiler option
-fcommon compiler option
-fdata-sections compiler option
-fexceptions compiler option
-ffreestanding compiler option
-ffunction-sections compiler option
-fgnu89-inline compiler option
-finline compiler option
-finline-functions compiler option
-fintelfpga compiler option
-fjump-tables compiler option
-fkeep-static-consts compiler option
-fmath-errno compiler option
-fno-asynchronous-unwind-tables compiler option
-fno-gnu-keywords compiler option
-fno-operator-names compiler option
-fno-rtti compiler option
-foffload-static-lib compiler option
-fomit-frame-pointer compiler option
-foptimize-sibling-calls compiler option
-fp compiler option
-fpack-struct compiler option
-fpermissive compiler option
-fpic compiler option
fpic
Creating Libraries
-fpie compiler option (Linux* only)
-fshort-enums compiler option
-fstack-protector compiler option
-fstack-protector-all compiler option
-fstack-protector-strong compiler option
-fsycl compiler option
-fsycl-add-targets compiler option
-fsycl-device-only compiler option
-fsycl-link compiler option
-fsycl-link-targets compiler option
-fsycl-targets compiler option
-fsycl-unnamed-lambda compiler option
-fsycl-use-bitcode compiler option
-fsyntax-only compiler option
-ftrapuv compiler option
-funroll-loops compiler option
-funsigned-char compiler option
-fuse-ld compiler option
-fverbose-asm compiler option
-fzero-initialized-in-bss compiler option
-g compiler option
-g0 compiler option
-g1 compiler option
-g2 compiler option
-g3 compiler option
-gdwarf-2 compiler option
-gdwarf-3 compiler option
-gdwarf-4 compiler option
-grecord-gcc-switches compiler option (Linux* only)
-gsplit-dwarf compiler option (Linux* only)
-H compiler option
-help compiler option
-I compiler option
-I- compiler option
-idirafter compiler option
-imacros compiler option
-ipo compiler option
Using IPO
ipo, Qipo
-ipp compiler option
-ipp-link compiler option
-iprefix compiler option
-iquote compiler option
-isystem compiler option
-iwithprefix compiler option
-iwithprefixbefore compiler option
-Kc++ compiler option
-l compiler option
-L compiler option
-M compiler option
-m64 compiler option
-m80387 compiler option
-march compiler option
-masm compiler option (Linux* only)
-mcmodel compiler option (Linux* only)
-MD compiler option
-MF compiler option
-MG compiler option
-mintrinsic-promote compiler option
-mkl compiler option
-MM compiler option
-MMD compiler option
-momit-leaf-frame-pointer compiler option
-MP compiler option
-MQ compiler option
-MT compiler option
-nodefaultlibs compiler option
-nostartfiles compiler option
-nostdinc++ compiler option
-nostdlib compiler option
-o compiler option
-O compiler option
-Ofast compiler option
-Os compiler option
-P compiler option
-pie compiler option
-print-multi-lib compiler option
-pthread compiler option
-regcall compiler option
-S compiler option
-save-temps compiler option
-shared compiler option
Using Intel Shared Libraries
Creating Libraries
-shared compiler option (Linux* only)
-shared-libgcc compiler option (Linux* only)
-static compiler option (Linux* only)
-static-libgcc compiler option (Linux* only)
-static-libstdc++ compiler option (Linux* only)
-std compiler option
-T compiler option (Linux* only)
-tbb compiler option
-u compiler option
-U compiler option
-undef compiler option
-unroll compiler option
-use-msasm compiler option
-v compiler option
-w compiler option
w
w, W
-Wa compiler option
-Wabi compiler option
-Wall compiler option
-Wcomment compiler option
-Wdeprecated compiler option
-Weffc++ compiler option
-Werror compiler option
-Werror-all compiler option
-Wextra-tokens compiler option
-Wformat compiler option
-Wformat-security compiler option
-Wl compiler option
-Wmain compiler option
-Wmissing-declarations compiler option
-Wmissing-prototypes compiler option
-Wp compiler option
-Wpointer-arith compiler option
-Wreorder compiler option
-Wreturn-type compiler option
-Wshadow compiler option
-Wsign-compare compiler option
-Wstrict-aliasing compiler option
-Wstrict-prototypes compiler option
-Wtrigraphs compiler option
-Wuninitialized compiler option
-Wunknown-pragmas compiler option
-Wunused-function compiler option
-Wunused-variable compiler option
-Wwrite-strings compiler option
-x (type) compiler option
-X compiler option
-Xlinker compiler option
-Xs compiler option
-Xsycl-target compiler option
-Zp compiler option
/c compiler option
/C compiler option
/D compiler option
/device-math-lib compiler option
/E compiler option
/EH compiler option
/EP compiler option
/Fa compiler option
/FA compiler option
/FC compiler option
/Fd compiler option
/FD compiler option
/Fe compiler option
/FI compiler option
/Fo compiler option
/Fp compiler option
/GA compiler option
/Gd compiler option
/GF compiler option
/Gm compiler option
/GR compiler option
/Gs compiler option
/GS compiler option
/guard compiler option
/guard:cf compiler option
/Gv compiler option
/GX compiler option
/Gy compiler option
/help compiler option
/I compiler option
/I- compiler option
/J compiler option
/LD compiler option
LD
Creating Libraries
/link compiler option
/MD compiler option
MD
Creating Libraries
/MT compiler option
Creating Libraries
MT
/nologo compiler option
/O compiler option
/Od compiler option
/Oi compiler option
/Os compiler option
/Ot compiler option
/Ox compiler option
/P compiler option
/Qcxx-features compiler option
/Qdaal compiler option
/QdD compiler option
/QdM compiler option
/Qeffc++ compiler option
/Qfreestanding compiler option
/QH compiler option
/Qintrinsic-promote compiler option
/Qipo compiler option
Using IPO
ipo, Qipo
/Qipp compiler option
/Qipp-link compiler option
/QM compiler option
/QMD compiler option
/QMF compiler option
/QMG compiler option
/Qmkl compiler option
/QMM compiler option
/QMMD compiler option
/QMT compiler option
/Qno-builtin-name compiler option
/Qpatchable-addresses compiler option
/Qregcall compiler option
/Qtbb compiler option
/Qunroll compiler option
/Qzero-initialized-in-bss compiler option
/RTC compiler option
/S compiler option
/showIncludes compiler option
/Tc compiler option
/TC compiler option
/Tp compiler option
/TP compiler option
/U compiler option
/vd compiler option
/vmg compiler option
/vmv compiler option
/w compiler option
/W compiler option
/Wall compiler option
/Werror-all compiler option
/WX compiler option
/X compiler option
/Y- compiler option
/Yc compiler option
/Yu compiler option
/Z7 compiler option
/Zc compiler option
/Zi compiler option
/ZI compiler option
/Zl compiler option
Zl
Creating Libraries
/Zp compiler option
/Zs compiler option
3rd Generation Intel® Core™ Processor Instruction Extensions
4th Generation Intel® Core™ Processor Instruction Extensions
access_by
adding files
adding the compiler
Adding the Compiler to Eclipse*
in Eclipse*
Advanced Vector Extensions
Intrinsics for Arithmetic Operations
Intrinsics for Bitwise Operations
Intrinsics for Blend and Conditional Merge Operations
Intrinsics for Compare Operations
Intrinsics for Conversion Operations
Intrinsics for Load and Store Operations
Intrinsics to Determine Minimum and Maximum Values
Intrinsics for Miscellaneous Operations
Details of Intel® Advanced Vector Extensions Intrinsics
Overview: Intrinsics for Intel® Advanced Vector Extensions Instructions
Intrinsics for Packed Test Operations
Intrinsics for Permute Operations
Intrinsics for Shuffle Operations
Intrinsics for Unpack and Interleave Operations
Intrinsics Generating Vectors of Undefined Values
Support Intrinsics for Vector Typecasting Operations
arithmetic operations
bitwise logical operations
blend and conditional merge operations
compare operations
conversion operations
load operations
minimum and maximum operations
miscellaneous operations
overview
Details of Intel® Advanced Vector Extensions Intrinsics
Overview: Intrinsics for Intel® Advanced Vector Extensions Instructions
packed test operations
permute operations
shuffle operations
store operations
unpack and interleave operations
vector generation operations
vector typecasting operations
Advanced Vector Extensions 2
Intrinsics for Arithmetic Operations
Intrinsics for Arithmetic Shift Operations
Intrinsics for Operations to Manipulate Integer Data at Bit-Granularity
Intrinsics for Bitwise Operations
Intrinsics for Blend Operations
Intrinsics for Broadcast Operations
Intrinsics for Compare Operations
Intrinsics for Fused Multiply Add Operations
Intrinsics for GATHER Operations
Intrinsics for Insert/Extract Operations
Intrinsics for Masked Load/Store Operations
Intrinsics for Logical Shift Operations
Intrinsics for Miscellaneous Operations
Intrinsics for Pack/Unpack Operations
Intrinsics for Packed Move with Extend Operations
Intrinsics for Permute Operations
Intrinsics for Shuffle Operations
arithmetic operations
arithmetic shift operations
bit manipulation operations
bitwise logical operations
blend operations
broadcast operations
compare operations
fused multiply-add (FMA) operations
GATHER operations
insert and extract operations
load and store operations
logical shift operations
miscellaneous operations
pack and unpack operations
packed move operations
permute operations
shuffle operations
align
Function Annotations and the SIMD Directive for Vectorization
attribute
align_value
attribute
aligned
align
attribute
alloc_section
ALLOCATABLE
Programming with Auto-parallelization
data flow
ANSI/ISO standard
aos1d_container
n_container
aos1d_container
max_val
n_index_generator
index_d template function
Constructing an n_container
Shape
n_extent_generator
Bounds
n_index_t
make_n_container template function
extent_d template function
aos1d_container::accessor
Accessor Concept
n_bounds_generator
n_bounds_t
soa1d_container::accessor and aos1d_container::accessor
bounds_t
bounds_d Template Function
sdlt::bounds Template Function
aos1d_container::const_accessor
applications
Redistributing Libraries When Deploying Applications
O
deploying
option specifying code optimization for
ar tool
assembler
Wa
option passing options to
assembler output file
masm
option specifying a dialect for
assembly files
Specifying Assembly Files
naming
assembly listing file
Fa
option specifying generation of
Asynchronous I/O async_class methods
clear_queue
get_error_operation_id
get_last_error
get_last_operation_id
get_status
resume_queue
stop_queue
wait
clear_queue()
get_error_operation_id()
get_last_error()
get_last_operation_id()
get_status()
resume_queue()
stop_queue()
wait()
Asynchronous I/O Extensions
Intel's C++ Asynchronous I/O Extensions for Windows* Operating Systems
Intel's C++ Asynchronous I/O Library for Windows* Operating Systems
Intel's C++ Asynchronous I/O Class for Windows* Operating Systems
introduction
library
template class
Asynchronous I/O library functions
aio_cancel
aio_error
aio_fsync
aio_read
aio_return
aio_suspend
aio_write
Handling Errors Caused by Asynchronous I/O Functions
Example for aio_cancel Function
Example for aio_error and aio_return Functions
Example for aio_read and aio_write Functions
Example for aio_suspend Function
Example for lio_listio Function
lio_listio
aio_cancel()
aio_error()
aio_fsync()
aio_read()
aio_return()
aio_suspend()
aio_write()
errno macro
Error Handling
examples
Example for aio_cancel Function
Example for aio_error and aio_return Functions
Example for aio_read and aio_write Functions
Example for aio_suspend Function
Example for lio_listio Function
aio_cancel()
aio_error()
aio_read()
Example for aio_read and aio_write Functions
aio_write()
aio_return
aio_suspend()
aio_write()
lio_listio()
lio_listio()
Asynchronous I/O template class
Template Class async_class
async_class
thread_control
attribute
align
align_value
concurrency_safe
const
cpu_dispatch, cpu_specific
mpx
align
align_value
aligned
concurrency_safe
const
cpu_dispatch
cpu_specific
mpx
auto-parallelization
Automatic Parallelization
guidelines
overview
programming with
Auto-parallelization
Language Support for Auto-parallelization
language support
auto-parallelizer
auto-vectorization
auto-vectorization hints
auto-vectorization of innermost loops
auto-vectorizer
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
Using Automatic Vectorization
AVX
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
SSE
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
SSE2
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
SSE3
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
SSSE3
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
using
avoid
Avoiding Mixed Data Type Arithmetic Expressions
inefficient data types
mixed arithmetic expressions
AVX
Intrinsics for Arithmetic Operations
Intrinsics for Bitwise Operations
Intrinsics for Blend and Conditional Merge Operations
Intrinsics for Compare Operations
Intrinsics for Conversion Operations
Intrinsics for Load and Store Operations
Intrinsics to Determine Minimum and Maximum Values
Intrinsics for Miscellaneous Operations
Details of Intel® Advanced Vector Extensions Intrinsics
Overview: Intrinsics for Intel® Advanced Vector Extensions Instructions
Intrinsics for Packed Test Operations
Intrinsics for Permute Operations
Intrinsics for Shuffle Operations
Intrinsics for Unpack and Interleave Operations
Intrinsics Generating Vectors of Undefined Values
Support Intrinsics for Vector Typecasting Operations
arithmetic operations
bitwise logical operations
blend and conditional merge operations
compare operations
conversion operations
load operations
minimum and maximum operations
miscellaneous operations
overview
Details of Intel® Advanced Vector Extensions Intrinsics
Overview: Intrinsics for Intel® Advanced Vector Extensions Instructions
packed test operations
permute operations
shuffle operations
store operations
unpack and interleave operations
vector generation operations
vector typecasting operations
AVX2
Intrinsics for Arithmetic Operations
Intrinsics for Arithmetic Shift Operations
Intrinsics for Operations to Manipulate Integer Data at Bit-Granularity
Intrinsics for Bitwise Operations
Intrinsics for Blend Operations
Intrinsics for Broadcast Operations
Intrinsics for Compare Operations
Intrinsics for Fused Multiply Add Operations
Intrinsics for GATHER Operations
Intrinsics for Insert/Extract Operations
Intrinsics for Masked Load/Store Operations
Intrinsics for Logical Shift Operations
Intrinsics for Miscellaneous Operations
Intrinsics for Pack/Unpack Operations
Intrinsics for Packed Move with Extend Operations
Intrinsics for Permute Operations
Intrinsics for Shuffle Operations
arithmetic operations
arithmetic shift operations
bit manipulation operations
bitwise logical operations
blend operations
broadcast operations
compare operations
fused multiply-add (FMA) operations
GATHER operations
insert and extract operations
load and store operations
logical shift operations
miscellaneous operations
pack and unpack operations
packed move operations
permute operations
shuffle operations
bit fields and signs
block_loop
building a project
with Eclipse*
building with Intel® C++
C and C++ standard libraries support
C++0x
std
option enabling support of
C++11
std
option enabling support of
c99
std
option enabling support of
calling conventions
capturing IPO output
Class Libraries
C++ Classes and SIMD Operations
Capabilities of C++ SIMD Classes
Terms, Conventions, and Syntax Defined
Arithmetic Operators
Cacheability Support Operators
Compare Operators
Conditional Select Operators for Fvec Classes
Constructors and Initialization
Conversions
Data Alignment
Debug Operations
Load and Store Operators
Logical Operators
Minimum and Maximum Operators
Move Mask Operators
Fvec Notation Conventions
Floating-point Vector Classes
Unpack Operators
Addition and Subtraction Operators
Assignment Operator
Clear MMX™ State Operator
Comparison Operators
Conditional Select Operators
Conversions between Fvec and Ivec
Debug Operations
Integer Functions for Streaming SIMD Extensions
Integer Vector Classes
Logical Operators
Multiplication Operators
Pack Operators
Rules for Operators
Shift Operators
Unpack Operators
Classes Quick Reference
C++ classes and SIMD operations
capabilities of C++ SIMD classes
conventions
floating-point vector classes
Arithmetic Operators
Cacheability Support Operators
Compare Operators
Conditional Select Operators for Fvec Classes
Constructors and Initialization
Conversions
Data Alignment
Debug Operations
Load and Store Operators
Logical Operators
Minimum and Maximum Operators
Move Mask Operators
Fvec Notation Conventions
Floating-point Vector Classes
Unpack Operators
arithmetic operators
cacheability support operators
compare operators
conditional select operators
constructors and initialization
conversions
data alignment
debug operators
load operators
logical operators
minimum and maximum operators
move mask operators
notation conventions
overview
store operators
unpack operators
integer vector classes
Addition and Subtraction Operators
Assignment Operator
Clear MMX™ State Operator
Comparison Operators
Conditional Select Operators
Conversions between Fvec and Ivec
Debug Operations
Integer Functions for Streaming SIMD Extensions
Integer Vector Classes
Logical Operators
Multiplication Operators
Pack Operators
Rules for Operators
Shift Operators
Unpack Operators
addition operators
Addition and Subtraction Operators
subtraction operators
assignment operator
clear MMX™ state operators
comparison operators
conditional select operators
conversions between fvec and ivec
debug operators
Debug Operations
element access operator
element assignment operators
functions for SSE
ivec classes
logical operators
multiplication operators
pack operators
rules for operators
shift operators
unpack operators
Quick reference
syntax
terms
Classes
Programming Example
programming example
code
Methods to Optimize Code Size
march
methods to optimize size of
option generating for specified CPU
code
Microsoft Compatibility
mixing managed and unmanaged
code layout
code size
Methods to Optimize Code Size
Disable or Decrease the Amount of Inlining
Disable Recognition and Expansion of Intrinsic Functions
Disable Loop Unrolling (Linux*)
Exclude Unused Code and Data from the Executable
Optimize Exception Handling Data (Linux*)
Strip Symbols from Your Binaries
Avoid References to Compiler-Specific Libraries (Linux*)
methods to optimize
option affecting inlining
option disabling expansion of functions
option disabling loop unrolling
option excluding data
option for certain exception handling
option stripping symbols
option to avoid library references
code_align
command line
command-line window
Using the Command Line on Windows*
setting up
compatibility
Microsoft Compatibility
with Microsoft* Visual Studio*
compilation phases
compilation units
compiler
Introducing the Intel® oneAPI DPC++ Compiler
Related Information
overview
Introducing the Intel® oneAPI DPC++ Compiler
Related Information
compiler
Compilation Phases
compilation phases
compiler command-line options
grecord-gcc-switches
option recording
compiler differences
Compilation and Execution Differences
between Intel® C++ and Microsoft* Visual C++*
compiler directives
Explicit Vector Programming
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
for vectorization
Explicit Vector Programming
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
compiler information
Saving Compiler Information in Your Executable
saving in your executable
compiler operation
Understanding File Extensions
Invoking the Intel® oneAPI DPC++ Compiler
input files
invoking from the command line
compiler options
Alphabetical List of Compiler Options
Specifying Symbol Visibility Explicitly (Linux*)
General Rules for Compiler Options
Displaying General Option Information From the Command Line
Passing Options to the Linker
What Appears in the Compiler Option Descriptions
alphabetical list of
for visibility
general rules for
how to display informational lists
linker-related
overview of descriptions of
compiler options
Using Compiler Options
Other Considerations
Other Considerations
Portability Options
GCC-Compatible Warning Options
command-line syntax
for optimization
Other Considerations
Other Considerations
for portability
gcc-compatible warning
option categories
using
compiler selection
Selecting the Compiler Version
in Visual Studio*
compiler setup
compilers
Multi-Version Compiler Support
using multiple versions
compilervars environment script
compilervars.bat
compiling
Other Considerations
compiling considerations
compiling
Other Considerations
gcc* code with Intel® C++ Compiler
compiling considerations
compiling large programs
compiling with IPO
concurrency_safe
attribute
conditional parallel region execution
Compiler Directed Inline Expansion of Functions
inline expansion
configuration files
configurations
Selecting a Configuration
debug and release
const
attribute
conventions
Notational Conventions
in the documentation
converting to Intel® C++ Compiler project system
correct usage of countable loop
COS
Loop Constructs
correct usage of
CPU
march
option generating code for specified
CPU time
Inline Expansion of Functions
for inline function expansion
cpu_dispatch
cpu_dispatch, cpu_specific
attribute
cpu_specific
cpu_dispatch, cpu_specific
attribute
create libraries using IPO
creating
Creating a New Project
projects
creating a new project
in Eclipse*
data format
Programming with Auto-parallelization
High-Level Optimization (HLO)
Explicit Vector Programming
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
partitioning
prefetching
type
Explicit Vector Programming
Vectorization
Automatic Vectorization Overview
Automatic Vectorization
data types
Using Efficient Data Types
efficiency
dataflow analysis
DAZ flag
debug information
Linking Debug Information
in program database file
option generating full
option generating in DWARF 2 format
option generating in DWARF 3 format
option generating in DWARF 4 format
option generating levels of
debugging
denormal exceptions
denormal numbers
denormalized numbers (IEEE*)
Special Values
NaN values
denormals
deploying applications
diagnostics
dialog boxes
Options: Intel Libraries for oneAPI dialog box
Options: Compilers dialog box
Intel® Performance Libraries
Options: Compilers
Options: Intel® Performance Libraries
directory
isystem
B
option adding to start of include path
option specifying for executables
option specifying for includes and libraries
directory paths
Specifying Directory Paths
in Microsoft Visual Studio*
disabling
Compiler Directed Inline Expansion of Functions
inlining
distribute_point
distributing applications
DO constructs
documentation
Notational Conventions
conventions for
driver tool commands
v
option specifying to show and execute
DWARF debug information
gsplit-dwarf
option creating object file containing
dynamic shared object
shared
option producing a
dynamic-link libraries (DLLs)
MD
option searching for unresolved references in
ebp register
fomit-frame-pointer
option determining use in optimizations
Eclipse*
Creating a New Project
Global Symbols and Visibility Attributes (Linux*)
Using Intel Libraries for oneAPI with Eclipse*
Using Eclipse* (Linux*)
Multi-Version Compiler Support
Running a Project
Specifying Symbol Visibility Explicitly (Linux*)
creating a new project
global symbols
integration
Using Intel Libraries for oneAPI with Eclipse*
adding the compiler
building a project
creating a new project
global symbols
multi-version compiler support
running a project
setting options
visibility declaration attribute
integration overview
projects
Multi-Version Compiler Support
multi-version compiler support
running a project
in Eclipse*
using Intel® Performance Libraries
visibility declaration attribute
Eclipse* integration
Adding the Compiler to Eclipse*
Using Eclipse* (Linux*)
building a project
efficiency
efficient
Compiler Directed Inline Expansion of Functions
inlining
efficient data types
EMMS Instruction
The EMMS Instruction: Why You Need It
EMMS Usage Guidelines
about
using
endian data
Loop Constructs
loop constructs
svrng_leapfrog_engine
svrng_copy_engine
svrng_generate[1|2|4|8|16|32]_[int|float|double]
svrng_delete_distribution
svrng_new_uniform_distribution_[int|float|double]/svrng_update_uniform_distribution_[int|float|double]
svrng_skipahead_engine
svrng_new_mcg31m1_engine/svrng_new_mcg31m1_ex
svrng_new_rand_engine/svrng_new_rand_ex
svrng_new_mt19937_engine/svrng_new_mt19937_ex
svrng_new_rand0_engine/svrng_new_rand0_ex
svrng_set_status
svrng_generate[1|2|4|8|16|32]_[uint|ulong]
svrng_new_mcg59_engine/svrng_new_mcg59_ex
svrng_delete_engine
svrng_get_status
svrng_new_normal_distribution_[float|double]/svrng_update_normal_distribution_[float|double]
Use a Third-Party Compiler as a Host Compiler for DPC++ Code
Distribution Initialization and Finalization
Parallel Computation Support
Random Values Generation
Ahead of Time Compilation
Error Handling
Notices and Disclaimers
Intel® oneAPI DPC++ Compiler Developer Guide and Reference (Beta)
Usage Model
Data Types and Calling Conventions
Service Routines
enums
environment variables
Supported Environment Variables
Linux*
run-time
setting
setting with setvars file
Windows*
environment variables
Managing Libraries
LD_LIBRARY_PATH
error messages
examples
Example for aio_cancel Function
Example for aio_error and aio_return Functions
Example for aio_suspend Function
Example for lio_listio Function
aio_cancel()
aio_error()
aio_return()
aio_suspend()
lio_listio()
exception handling
fexceptions
option generating table of
execution flow
explicit vector programming
Explicit Vector Programming
array notations
elemental functions
SIMD
extended control registers
Overview: Intrinsics for Managing Extended Processor States and Registers
Intrinsics for Reading and Writing the Content of Extended Control Registers
managing
reading
writing
extended processor states
Overview: Intrinsics for Managing Extended Processor States and Registers
managing
extensions
extensions
C and C++ Standard Libraries Support
Queue Order Properties
Sub-groups for NDRange Parallelism
SYCL_INTEL_unnamed_kernel_lambda
Unified Shared Memory
C and C++ standard libraries support
queue order properties
sub-groups for NDRange parallelism
SYCL_INTEL_unnamed_kernel_lambda
unified shared memory
feature requirements
float64 vector intrinsics
Double-precision Floating-point Vector Intrinsics
Intel® Streaming SIMD Extensions 3
floating-point array operation
Floating-point array: Handling
floating-point exceptions
Reducing the Impact of Denormal Exceptions
denormal exceptions
floating-point numbers
Special Values
special values
forceinline
format function security problems
Wformat-security
option issuing warning for
FP comparison operations
Intrinsics for FP Comparison Operations
_mm_comi_round_sd
_mm_comi_round_ss
_mm[_mask]_cmp_round_sd_mask
_mm[_mask]_cmp_round_ss_mask
_mm[_mask]_cmp_sd_mask
_mm[_mask]_cmp_ss_mask
_mm512[_mask]_cmp_epi64_mask
_mm512[_mask]_cmp_epu64_mask
_mm512[_mask]_cmp_round_pd_mask
_mm512[_mask]_cmp_round_ps_mask
frame pointer
momit-leaf-frame-pointer
option affecting leaf functions
FTZ flag
Function annotations
Function Annotations and the SIMD Directive for Vectorization
__declspec(align)
__declspec(vector)
function expansion
function pointers
SIMD-Enabled Function Pointers
SIMD-enabled
function preemption
g++* language extensions
gcc C++ run-time libraries
idirafter
X
include file path
option adding a directory to second
option removing standard directories from
gcc-compatible warning options
gcc* compatibility
gcc* considerations
gcc* interoperability
gcc* language extensions
general compiler directives
Programming with Auto-parallelization
Inline Expansion of Functions
Programming Guidelines for Vectorization
for auto-parallelization
for inlining functions
for vectorization
global symbols
GNU C++ compatibility
half-float conversion
hardware lock elision
Intrinsics for Hardware Lock Elision Operations
Intrinsics for Intel® Transactional Synchronization Extensions (Intel® TSX)
help
Getting Help and Support
using in Microsoft Visual Studio*
high performance programming
High-Level Optimization (HLO)
applications for
high-level optimizer
HLO
IA-32 architecture based applications
High-Level Optimization (HLO)
HLO
IEEE Standard for Floating-Point Arithmetic, IEEE 754-2008
IEEE*
Special Values
floating-point values
include files
inline
inlining
Inline Expansion of Functions
Developer Directed Inline Expansion of User Functions
Compiler Directed Inline Expansion of Functions
compiler directed
developer directed
preemption
input files
integer comparison operations
integer vector intrinsics
Integer Vector Intrinsic
Intel® Streaming SIMD Extensions 3
integrating Intel® C++ with Microsoft* Visual Studio*
intel_omp_task
intel_omp_taskq
Intel's C++ asynchronous I/O template class
Example for Using async_class Template Class
Usage Example
Intel's Memory Allocator Library
Intel's Numeric String Conversion Library
Overview: Intel's Numeric String Conversion Library
libistrconv
Function List
Intel's Numeric String Conversion Library
Intel® 64 architecture based applications
High-Level Optimization (HLO)
HLO
Intel® IPP libraries
ipp-link, Qipp-link
ipp, Qipp
option letting you choose the library to link to
option letting you link to
Intel® linking tools
Intel® MKL
mkl, Qmkl
option letting you link to libraries
Intel® TBB libraries
tbb, Qtbb
option letting you link to
Intel® AVX
Intel® AVX Intrinsic
_mm256_stream_si256
_mm256_stream_si256 (VMOVNTDQ)
Intel® AVX-512
Intrinsics for Integer Comparison Operations
Intrinsics for Integer Reduction Operations
comparison operations
reduction operations
Intel® C++
Using the Command Line on Windows*
command-line environment
Intel® C++ Class Libraries
overview
Intel® C++ Compiler command prompt window
Intel® extension environment variables
Intel® Hyper-Threading Technology
Enabling Further Loop Parallelization for Multicore Platforms
parallel loops
thread pools
Intel® IEEE 754-2008 Binary Floating-Point Conformance Library
Overview: Intel® IEEE 754-2008 Binary Floating-Point Conformance Library
formatOf general-computational operations
General-Computational Operations Functions
from_string
to_int64_floor
to_int64_rninta
to_int64_xceil
to_int64_xfloor
to_uint32_int
to_uint32_rninta
to_uint32_xceil
to_uint32_xfloor
add
binary32_to_binary64
binary64_to_binary32
div
fma
from_hexstring
from_int32
from_int64
from_uint32
from_uint64
mul
sqrt
sub
to_hexstring
to_int32_ceil
to_int32_floor
to_int32_int
to_int32_rnint
to_int32_rninta
to_int32_xceil
to_int32_xfloor
to_int32_xint
to_int32_xrnint
to_int32_xrninta
to_int64_ceil
to_int64_int
to_int64_rnint
to_int64_xint
to_int64_xrnint
to_int64_xrninta
to_string
to_uint32_ceil
to_uint32_floor
to_uint32_rnint
to_uint32_xint
to_uint32_xrnint
to_uint32_xrninta
to_uint64_ceil
to_uint64_floor
to_uint64_int
to_uint64_rnint
to_uint64_rninta
to_uint64_xceil
to_uint64_xfloor
to_uint64_xint
to_uint64_xrnint
to_uint64_xrninta
homogeneous general-computational operations
homogeneous general-computational operations
Homogeneous General-Computational Operations Functions
ilogb
maxnum
maxnum_mag
minnum
minnum_mag
next_down
next_up
rem
round_integral_exact
round_integral_nearest_away
round_integral_nearest_even
round_integral_negative
round_integral_positive
round_integral_zero
scalbn
non-computational operations
Function List
isNaN
isNormal
isSubnormal
lowerFlags
restoreFlags
testFlags
testSavedFlags
totalOrderMag
class
defaultMode
getBinaryRoundingDirection
is754version1985
is754version2008
isCanonical
isFinite
isInfinite
isSignaling
isSignMinus
isZero
radix
raiseFlags
restoreModes
saveFlags
setBinaryRoundingDirection
saveModes
totalOrder
nonhomogeneous general-computational operations
quiet-computational operations
Function List
copy
negate
copysign
signaling-computational operations
Function List
signaling_greater_equal
quiet_equal
quiet_greater
quiet_greater_equal
quiet_greater_unordered
quiet_less
quiet_less_equal
quiet_less_unordered
quiet_not_equal
quiet_not_greater
quiet_not_less
quiet_ordered
quiet_unordered
signaling_equal
signaling_greater
signaling_greater_unordered
signaling_less
signaling_less_unordered
signaling_less_equal
signaling_not_equal
signaling_not_greater
signaling_not_less
using the library
Intel® Integrated Performance Primitives
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
Intel® Math Kernel Library
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
Intel® Math Library
Other Considerations
C99 macros
fpclassify
isfinite
isgreater
isgreaterequal
isinf
isless
islessequal
islessgreater
isnan
isnormal
isunordered
signbit
Intel® Performance Libraries
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
Intel® Integrated Performance Primitives (Intel® IPP)
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
Intel® Math Kernel Library (Intel® MKL)
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
Intel® Threading Building Blocks (Intel® TBB)
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
Intel® SSE4 intrinsics
Application Targeted Accelerators Intrinsics
Floating Point Dot Product Intrinsics
application targeted accelerator intrinsics
intrinsics
Intel® Streaming SIMD Extensions
Cacheability Support Intrinsics
Compare Intrinsics
Conversion Intrinsics
Details about Intel® Streaming SIMD Extensions Intrinsics
Integer Intrinsics
Load Intrinsics
Logical Intrinsics
Macro Function for Matrix Transposition
Macro Functions to Read and Write Control Registers
Macro Function for Shuffle Operations
Miscellaneous Intrinsics
Overview: Intel® Streaming SIMD Extensions (Intel® SSE)
Writing Programs with Intel® Streaming SIMD Extensions (Intel® SSE) Intrinsics
Set Intrinsics
Store Intrinsics
cacheability support operations
compare operations
conversion operations
data types
integer operations
load operations
logical operations
macro functions
Macro Function for Matrix Transposition
Macro Functions to Read and Write Control Registers
Macro Function for Shuffle Operations
matrix transposition
shuffle function
Macro Functions to Read and Write Control Registers
Macro Function for Shuffle Operations
miscellaneous operations
overview
programming with Intel® SSE intrinsics
registers
set operations
store operations
Intel® Streaming SIMD Extensions (Intel® SSE)
Intel® Streaming SIMD Extensions 2
Cacheability Support Intrinsics
Casting Support Intrinsics
Arithmetic Intrinsics
Compare Intrinsics
Conversion Intrinsics
Load Intrinsics
Logical Intrinsics
Set Intrinsics
Store Intrinsics
Arithmetic Intrinsics
Compare Intrinsics
Conversion Intrinsics
Load Intrinsics
Logical Intrinsics
Move Intrinsics
Set Intrinsics
Shift Intrinsics
Store Intrinsics
Miscellaneous Intrinsics
Overview: Intel® Streaming SIMD Extensions 2 (Intel® SSE2)
Pause Intrinsic
Macro Function for Shuffle
cacheability support intrinsics
casting support intrinsics
FP arithmetic intrinsics
FP compare intrinsics
FP conversion intrinsics
FP load intrinsics
FP logical intrinsics
FP set intrinsics
FP store intrinsics
integer arithmetic intrinsics
integer compare intrinsics
integer conversion intrinsics
integer load intrinsics
integer logical intrinsics
integer move intrinsics
integer set intrinsics
integer shift intrinsics
integer store intrinsics
miscellaneous intrinsics
overview
pause intrinsic
shuffle macro
Intel® Streaming SIMD Extensions 2
Macro Functions
macro functions
Intel® Streaming SIMD Extensions 3
Overview: Intel® Streaming SIMD Extensions 3 (Intel® SSE3)
overview
Intel® Streaming SIMD Extensions 3
Macro Functions
macro functions
Intel® Streaming SIMD Extensions 4
Application Targeted Accelerators Intrinsics
Cacheability Support Intrinsic
Floating Point Rounding Intrinsics
Floating Point Dot Product Intrinsics
Packed Blending Intrinsics
Packed Compare for Equal Intrinsic
Packed Compare Intrinsics
Packed DWORD to Unsigned WORD Intrinsic
Packed Format Conversion Intrinsics
Packed Integer Min/Max Intrinsics
Register Insertion/Extraction Intrinsics
Test Intrinsics
DWORD Multiply Intrinsics
application targeted accelerator intrinsics
cacheability support intrinsic
floating-point rounding intrinsics
FP dot product intrinsics
packed blending intrinsics
packed compare for equal intrinsic
packed compare intrinsics
packed DWORD to unsigned WORD intrinsic
packed format conversion intrinsics
packed integer min/max intrinsics
register insertion/extraction intrinsics
test intrinsics
Test Intrinsics
DWORD Multiply Intrinsics
Intel® Streaming SIMD Extensions 4
Overview: Intel® Streaming SIMD Extensions 4 (Intel® SSE4)
overview
Intel® Threading Building Blocks
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
intermediate files
save-temps
option saving during compilation
intermediate representation (IR)
Using IPO
Interprocedural Optimization (IPO)
interoperability
GCC* Compatibility and Interoperability
with g++*
with gcc*
interprocedural optimizations
Compiler Directed Inline Expansion of Functions
capturing intermediate output
code layout
compilation
compiling
considerations
creating libraries
issues
large programs
linking
Using IPO
Interprocedural Optimization (IPO)
option enabling between files
option enabling for single file compilation
overview
performance
using
whole program analysis
xiar
xild
xilibtool
intrinsics
Intrinsics Returning Vectors of Undefined Values
_rdrand16_step(), _rdrand32_step(), _rdrand64_step()
Intrinsics that Allow Reading from and Writing to the FS Base and GS Base Registers
Intrinsics for Converting Half Floats that Map to 3rd Generation Intel® Core™ Processor Instructions
_mm_cvtph_ps()
_mm_cvtps_ph()
_mm256_cvtph_ps()
_mm256_cvtps_ph()
Overview: Intrinsics for 3rd Generation Intel® Core™ Processor Instruction Extensions
Intrinsics that Generate Random Numbers of 16/32/64 Bit Wide Random Integers
_addcarry_u32(), _addcarry_u64()
_addcarryx_u32(), _addcarryx_u64()
_subborrow_u32(), _subborrow_u64()
Intrinsics for Multi-Precision Arithmetic
Overview: Intrinsics for 4th Generation Intel® Core™ Processor Instruction Extensions
Intrinsics
Intrinsics for Carry-less Multiplication Instruction and Advanced Encryption Standard Instructions
Overview: Intrinsics for Carry-less Multiplication Instruction and Advanced Encryption Standard Instructions
String and Block Copy Intrinsics
Floating-point Intrinsics
Integer Arithmetic Intrinsics
Miscellaneous Intrinsics
Overview: Intrinsics across Intel® Architectures
Alignment Support
Overview: Data Alignment, Memory Allocation Intrinsics, and Inline Assembly
Details about Intrinsics
Intrinsics for Saving and Restoring the Extended Processor States
Intrinsics for Reading and Writing the Content of Extended Control Registers
Intrinsics for Managing Extended Processor States and Registers
Intrinsics for Converting Half Floats
Overview: Intrinsics to Convert Half Float Types
Inline Assembly
Intrinsics for Intel® Advanced Vector Extensions
Intrinsics for Intel® Advanced Vector Extensions 2
_mm256_hadd_ps
_mm256_addsub_pd
_mm256_addsub_ps
_mm256_div_pd
_mm256_div_ps
_mm256_dp_ps
_mm256_hadd_pd
_mm256_hsub_pd
_mm256_hsub_ps
_mm256_mul_pd
_mm256_mul_ps
_mm256_rcp_ps
_mm256_rsqrt_ps
_mm256_sqrt_pd
_mm256_sqrt_ps
Intrinsics for Arithmetic Operations
Intrinsics for Bitwise Operations
_mm256_and_pd
_mm256_and_ps
_mm256_andnot_pd
_mm256_andnot_ps
_mm256_or_pd
_mm256_or_ps
_mm256_xor_pd
_mm256_xor_ps
Intrinsics for Blend and Conditional Merge Operations
Intrinsics for Compare Operations
Intrinsics for Conversion Operations
Intrinsics for Load and Store Operations
Intrinsics to Determine Minimum and Maximum Values
Intrinsics for Miscellaneous Operations
_mm256_undefined_pd()
_mm256_undefined_ps()
_mm256_undefined_si256
_mm256_max_pd
_mm256_max_ps
_mm256_min_pd
_mm256_min_ps
Details of Intel® Advanced Vector Extensions Intrinsics
Overview: Intrinsics for Intel® Advanced Vector Extensions Instructions
Intrinsics for Packed Test Operations
Intrinsics for Permute Operations
Intrinsics for Shuffle Operations
Intrinsics for Unpack and Interleave Operations
Intrinsics Generating Vectors of Undefined Values
Support Intrinsics for Vector Typecasting Operations
Intrinsics for Arithmetic Operations
Intrinsics for Arithmetic Shift Operations
Intrinsics for Operations to Manipulate Integer Data at Bit-Granularity
Intrinsics for Bitwise Operations
Intrinsics for Blend Operations
Intrinsics for Broadcast Operations
Intrinsics for Compare Operations
Intrinsics for Fused Multiply Add Operations
Intrinsics for GATHER Operations
Intrinsics for Insert/Extract Operations
Intrinsics for Masked Load/Store Operations
Intrinsics for Logical Shift Operations
_mm_maskload_epi32/64, _mm256_maskload_epi32/64
_mm_maskstore_epi32/64, _mm256_maskstore_epi32/64
Intrinsics for Miscellaneous Operations
Intrinsics for Pack/Unpack Operations
Intrinsics for Packed Move with Extend Operations
Intrinsics for Permute Operations
Intrinsics for Shuffle Operations
Intrinsics for Intel® Transactional Synchronization Extensions (Intel® TSX)
Arithmetic Intrinsics
Cacheability Support Intrinsics
Compare Intrinsics
Conversion Intrinsics
Details about Intel® Streaming SIMD Extensions Intrinsics
Integer Intrinsics
Load Intrinsics
Logical Intrinsics
Macro Function for Matrix Transposition
Macro Functions to Read and Write Control Registers
Macro Function for Shuffle Operations
Miscellaneous Intrinsics
Overview: Intel® Streaming SIMD Extensions (Intel® SSE)
Writing Programs with Intel® Streaming SIMD Extensions (Intel® SSE) Intrinsics
Intrinsics to Read and Write Registers
Set Intrinsics
Store Intrinsics
Cacheability Support Intrinsics
Casting Support Intrinsics
Arithmetic Intrinsics
Compare Intrinsics
Conversion Intrinsics
Load Intrinsics
Logical Intrinsics
Set Intrinsics
Store Intrinsics
Arithmetic Intrinsics
Compare Intrinsics
Conversion Intrinsics
Load Intrinsics
Logical Intrinsics
Move Intrinsics
Set Intrinsics
Shift Intrinsics
Store Intrinsics
Miscellaneous Intrinsics
Overview: Intel® Streaming SIMD Extensions 2 (Intel® SSE2)
Pause Intrinsic
Macro Function for Shuffle
Single-precision Floating-point Vector Intrinsics
Double-precision Floating-point Vector Intrinsics
Integer Vector Intrinsic
Miscellaneous Intrinsics
Overview: Intel® Streaming SIMD Extensions 3 (Intel® SSE3)
Application Targeted Accelerators Intrinsics
Cacheability Support Intrinsic
DWORD Multiply Intrinsics
Floating Point Rounding Intrinsics
Floating Point Dot Product Intrinsics
Overview: Intel® Streaming SIMD Extensions 4 (Intel® SSE4)
Packed Blending Intrinsics
Packed Compare for Equal Intrinsic
Packed Compare Intrinsics
Packed DWORD to Unsigned WORD Intrinsic
Packed Format Conversion Intrinsics
Packed Integer Min/Max Intrinsics
Register Insertion/Extraction Intrinsics
Test Intrinsics
Intrinsics for Later Generation Intel® Core™ Processor Instruction Extensions
Allocating and Freeing Aligned Memory Blocks
Details about MMX™ Technology Intrinsics
Compare Intrinsics (MMX™ technology)
The EMMS Instruction: Why You Need It
EMMS Usage Guidelines
General Support Intrinsics (MMX™ technology)
Logical Intrinsics (MMX™ technology)
Overview: Intrinsics for MMX™ Technology
Packed Arithmetic Intrinsics (MMX™ technology)
Set Intrinsics (MMX™ technology)
Shift Intrinsics (MMX™ technology)
Naming and Usage Syntax
References
Absolute Value Intrinsics
Addition Intrinsics
Concatenate Intrinsics
Multiplication Intrinsics
Negation Intrinsics
Overview: Supplemental Streaming SIMD Extensions 3 (SSSE3)
Shuffle Intrinsics
Subtraction Intrinsics
_mm_cexp_ps, _mm256_cexp_ps
_mm_clog_ps, _mm256_clog_ps
_mm_csqrt_ps, _mm256_csqrt_ps
_mm_cdfnorminv_pd, _mm256_cdfnorminv_pd
_mm_cdfnorminv_ps, _mm256_cdfnorminv_ps
_mm_erf_pd, _mm256_erf_pd
_mm_erf_ps, _mm256_erf_ps
_mm_erfc_pd, _mm256_erfc_pd
_mm_erfc_ps, _mm256_erfc_ps
_mm_erfinv_pd, _mm256_erfinv_pd
_mm_erfinv_ps, _mm256_erfinv_ps
_mm_exp2_pd, _mm256_exp2_pd
_mm_exp2_ps, _mm256_exp2_ps
_mm_hypot_ps, _mm256_hypot_ps
_mm_exp_pd, _mm256_exp_pd
_mm_exp_ps, _mm256_exp_ps
_mm_exp10_pd, _mm256_exp10_pd
_mm_exp10_ps, _mm256_exp10_ps
_mm_expm1_pd, _mm256_expm1_pd
_mm_expm1_ps, _mm256_expm1_ps
_mm_hypot_pd, _mm256_hypot_pd
_mm_pow_pd, _mm256_pow_pd
_mm_pow_ps, _mm256_pow_ps
_mm_log_pd, _mm256_log_pd
_mm_log_ps, _mm256_log_ps
_mm_log10_pd, _mm256_log10_pd
_mm_log10_ps, _mm256_log10_ps
_mm_log1p_pd, _mm256_log1p_pd
_mm_log1p_ps, _mm256_log1p_ps
_mm_log2_pd, _mm256_log2_pd
_mm_log2_ps, _mm256_log2_ps
_mm_logb_pd, _mm256_logb_pd
_mm_logb_ps, _mm256_logb_ps
Overview: Intrinsics for Short Vector Math Library (SVML) Functions
_mm_sqrt_ps, _mm256_sqrt_ps
_mm_cbrt_pd, _mm256_cbrt_pd
_mm_cbrt_ps, _mm256_cbrt_ps
_mm_invcbrt_pd, _mm256_invcbrt_pd
_mm_invcbrt_ps, _mm256_invcbrt_ps
_mm_invsqrt_pd, _mm256_invsqrt_pd
_mm_invsqrt_ps, _mm256_invsqrt_ps
_mm_sqrt_pd, _mm256_sqrt_pd
_mm_sinh_ps, _mm256_sinh_ps
_mm_acos_pd, _mm256_acos_pd
_mm_acos_ps, _mm256_acos_ps
_mm_acosh_pd, _mm256_acosh_pd
_mm_acosh_ps, _mm256_acosh_ps
_mm_asin_pd, _mm256_asin_pd
_mm_asin_ps, _mm256_asin_ps
_mm_asinh_pd, _mm256_asinh_pd
_mm_asinh_ps, _mm256_asinh_ps
_mm_atan_pd, _mm256_atan_pd
_mm_atan_ps, _mm256_atan_ps
_mm_atan2_pd, _mm256_atan2_pd
_mm_atan2_ps, _mm256_atan2_ps
_mm_atanh_pd, _mm256_atanh_pd
_mm_atanh_ps, _mm256_atanh_ps
_mm_cos_pd, _mm256_cos_pd
_mm_cos_ps, _mm256_cos_ps
_mm_cosd_pd, _mm256_cosd_pd
_mm_cosd_ps, _mm256_cosd_ps
_mm_cosh_pd, _mm256_cosh_pd
_mm_cosh_ps, _mm256_cosh_ps
_mm_sin_pd, _mm256_sin_pd
_mm_sin_ps, _mm256_sin_ps
_mm_sincos_pd, _mm256_sincos_pd
_mm_sincos_ps, _mm256_sincos_ps
_mm_sind_pd, _mm256_sind_pd
_mm_sind_ps, _mm256_sind_ps
_mm_sinh_pd, _mm256_sinh_pd
_mm_tan_pd, _mm256_tan_pd
_mm_tan_ps, _mm256_tan_ps
_mm_tand_pd, _mm256_tand_pd
_mm_tand_ps, _mm256_tand_ps
_mm_tanh_pd, _mm256_tanh_pd
_mm_tanh_ps, _mm256_tanh_ps
Intel® SSE2
Intrinsics Returning Vectors of Undefined Values
intrinsics returning vectors of undefined values
Intrinsics Returning Vectors of Undefined Values
_mm_undefined_pd()
_mm_undefined_si128()
3rd Generation Intel® Core™ Processor Instruction Extensions
_rdrand16_step(), _rdrand32_step(), _rdrand64_step()
Intrinsics that Allow Reading from and Writing to the FS Base and GS Base Registers
Intrinsics for Converting Half Floats that Map to 3rd Generation Intel® Core™ Processor Instructions
_mm_cvtph_ps()
_mm_cvtps_ph()
_mm256_cvtph_ps()
_mm256_cvtps_ph()
Overview: Intrinsics for 3rd Generation Intel® Core™ Processor Instruction Extensions
Intrinsics that Generate Random Numbers of 16/32/64 Bit Wide Random Integers
_rdrand16_step()
_rdrand32_step()
_rdrand64_step()
base registers
Intrinsics that Allow Reading from and Writing to the FS Base and GS Base Registers
_readfsbase_u32()
_readfsbase_u64()
_readgsbase_u32()
_readgsbase_u64()
_writefsbase_u32()
_writefsbase_u64()
_writegsbase_u32()
_writegsbase_u64()
half-float
Intrinsics for Converting Half Floats that Map to 3rd Generation Intel® Core™ Processor Instructions
_mm_cvtph_ps()
_mm_cvtps_ph()
_mm256_cvtph_ps()
_mm256_cvtps_ph()
conversion
Intrinsics for Converting Half Floats that Map to 3rd Generation Intel® Core™ Processor Instructions
_mm_cvtph_ps()
_mm_cvtps_ph()
_mm256_cvtph_ps()
_mm256_cvtps_ph()
_mm_cvtph_ps()
Intrinsics for Converting Half Floats that Map to 3rd Generation Intel® Core™ Processor Instructions
_mm_cvtph_ps()
_mm_cvtps_ph()
_mm_cvtps_ph()
_mm256_cvtph_ps()
Intrinsics for Converting Half Floats that Map to 3rd Generation Intel® Core™ Processor Instructions
_mm256_cvtph_ps()
_mm256_cvtps_ph()
_mm256_cvtps_ph()
Intrinsics for Converting Half Floats that Map to 3rd Generation Intel® Core™ Processor Instructions
overview
random number generation (RDRAND)
4th Generation Intel® Core™ Processor Instruction Extensions
_addcarry_u32(), _addcarry_u64()
_addcarryx_u32(), _addcarryx_u64()
_subborrow_u32(), _subborrow_u64()
Intrinsics for Multi-Precision Arithmetic
Overview: Intrinsics for 4th Generation Intel® Core™ Processor Instruction Extensions
Intrinsics that Generate Random Numbers of 16/32/64 Bit Wide Random Integers
_addcarry_u32()
_addcarry_u64()
_addcarryx_u32()
_addcarryx_u64()
_subborrow_u32()
_subborrow_u64()
Multi-Precision Arithmetic
overview
random number generation (RDSEED)
about
Advanced Encryption Standard (AES) Implementation
Intrinsics for Carry-less Multiplication Instruction and Advanced Encryption Standard Instructions
Overview: Intrinsics for Carry-less Multiplication Instruction and Advanced Encryption Standard Instructions
_mm_aesdec_si128
_mm_aesdeclast_si128
_mm_aesenc_si128
_mm_aesenclast_si128
_mm_aesimc_si128
_mm_aeskeygenassist_si128
overview
All Intel Architectures
String and Block Copy Intrinsics
string and block copy operations
All Intel® Architectures
Floating-point Intrinsics
Integer Arithmetic Intrinsics
Miscellaneous Intrinsics
Overview: Intrinsics across Intel® Architectures
floating point operations
integer arithmetic operations
miscellaneous operations
Miscellaneous Intrinsics
_BitScanForward
_BitScanReverse
_bittest
_bittestandreset
_bittestandset
_bittestandcomplement
overview
carry-less multiplication instruction
Overview: Intrinsics for Carry-less Multiplication Instruction and Advanced Encryption Standard Instructions
_mm_clmulepi64_si128
data alignment
Alignment Support
Overview: Data Alignment, Memory Allocation Intrinsics, and Inline Assembly
data types
extended processor states
Intrinsics for Saving and Restoring the Extended Processor States
restoring
saving
for managing extended processor states and registers
Intrinsics for Reading and Writing the Content of Extended Control Registers
Intrinsics for Managing Extended Processor States and Registers
_fxrstor()
_fxrstor64()
_fxsave()
_fxsave64()
_xgetbv()
_xrstor()
_xrstor64()
_xrstors()
_xrstors64()
_xsave()
_xsave64()
_xsavec()
_xsavec64()
_xsaveopt()
_xsaveopt64()
_xsaves()
_xsaves64()
_xsetbv()
restoring extended processor states
saving extended processor states
half-float conversion
Intrinsics for Converting Half Floats
Overview: Intrinsics to Convert Half Float Types
_cvtsh_ss
_cvtss_sh
_mm_cvtph_ps
_mm_cvtps_ph
overview
inline assembly
Inline Assembly
Overview: Data Alignment, Memory Allocation Intrinsics, and Inline Assembly
Intel® Advanced Vector Extensions (AVX)
Intel® Advanced Vector Extensions 2 (Intel® AVX2)
Intrinsics for Intel® Advanced Vector Extensions 2
overview
Intel® AVX
_mm256_hadd_ps
_mm256_addsub_pd
_mm256_addsub_ps
_mm256_div_pd
_mm256_div_ps
_mm256_dp_ps
_mm256_hadd_pd
_mm256_hsub_pd
_mm256_hsub_ps
_mm256_mul_pd
_mm256_mul_ps
_mm256_rcp_ps
_mm256_rsqrt_ps
_mm256_sqrt_pd
_mm256_sqrt_ps
Intrinsics for Arithmetic Operations
Intrinsics for Bitwise Operations
_mm256_and_pd
_mm256_and_ps
_mm256_andnot_pd
_mm256_andnot_ps
_mm256_or_pd
_mm256_or_ps
_mm256_xor_pd
_mm256_xor_ps
Intrinsics for Blend and Conditional Merge Operations
Intrinsics for Compare Operations
Intrinsics for Conversion Operations
Intrinsics for Load and Store Operations
Intrinsics to Determine Minimum and Maximum Values
Intrinsics for Miscellaneous Operations
_mm256_undefined_pd()
_mm256_undefined_ps()
_mm256_undefined_si256
_mm256_max_pd
_mm256_max_ps
_mm256_min_pd
_mm256_min_ps
Details of Intel® Advanced Vector Extensions Intrinsics
Overview: Intrinsics for Intel® Advanced Vector Extensions Instructions
Intrinsics for Packed Test Operations
Intrinsics for Permute Operations
Intrinsics for Shuffle Operations
Intrinsics for Unpack and Interleave Operations
Intrinsics Generating Vectors of Undefined Values
Support Intrinsics for Vector Typecasting Operations
arithmetic intrinsics
_mm256_hadd_ps
_mm256_addsub_pd
_mm256_addsub_ps
_mm256_div_pd
_mm256_div_ps
_mm256_dp_ps
_mm256_hadd_pd
_mm256_hsub_pd
_mm256_hsub_ps
_mm256_mul_pd
_mm256_mul_ps
_mm256_rcp_ps
_mm256_rsqrt_ps
_mm256_sqrt_pd
_mm256_sqrt_ps
_mm256_hadd_ps
_mm256_addsub_pd (VADDSUBPD)
_mm256_addsub_ps (VADDSUBPS)
_mm256_div_pd (VDIVPD)
_mm256_div_ps (VDIVPS)
_mm256_dp_ps (VDPPS)
_mm256_hadd_pd (VHADDPD)
_mm256_hsub_pd (VHSUBPD)
_mm256_hsub_ps (VHSUBPS)
_mm256_mul_pd (VMULPD)
_mm256_mul_ps (VMULPS)
_mm256_rcp_ps (VRCPPS)
_mm256_rsqrt_ps (VRSQRTPS)
_mm256_sqrt_pd (VSQRTPD)
_mm256_sqrt_ps (VSQRTPS)
arithmetic operations
Intrinsics for Arithmetic Operations
_mm256_add_pd (VADDPD)
_mm256_add_ps (VADDPS)
_mm256_sub_pd (VSUBPD)
_mm256_sub_ps (VSUBPS)
bitwise logical operations
bitwise operations
_mm256_and_pd
_mm256_and_ps
_mm256_andnot_pd
_mm256_andnot_ps
_mm256_or_pd
_mm256_or_ps
_mm256_xor_pd
_mm256_xor_ps
_mm256_and_pd (VANDPD)
_mm256_and_ps (VANDPS)
_mm256_andnot_pd (VANDNPD)
_mm256_andnot_ps (VANDNPS)
_mm256_or_pd (VORPD)
_mm256_or_ps (VORPS)
_mm256_xor_pd (VXORPD)
_mm256_xor_ps (VXORPS)
blend and conditional merge operations
Intrinsics for Blend and Conditional Merge Operations
_mm256_blendv_ps (VBLENDVPS)
_mm256_blend_pd (VBLENDPD)
_mm256_blend_ps (VBLENDPS)
_mm256_blendv_pd (VBLENDVPD)
compare operations
Intrinsics for Compare Operations
_mm256_cmp_pd (VCMPPD)
_mm_cmp_pd (VCMPPD)
_mm_cmp_ps (VCMPPS)
_mm_cmp_sd (VCMPSD)
_mm_cmp_ss (VCMPSS)
_mm256_cmp_ps (VCMPPS)
conversion operations
Intrinsics for Conversion Operations
_mm256_cvtepi32_pd (VCVTDQ2PD)
_mm256_cvtepi32_ps (VCVTDQ2PS)
_mm256_cvtpd_epi32 (VCVTPD2DQ)
_mm256_cvtpd_ps (VCVTPD2PS)
_mm256_cvtps_epi32 (VCVTPS2DQ)
_mm256_cvtps_pd (VCVTPS2PD)
_mm256_cvtsd_f64 (VMOVSD)
_mm256_cvtss_f32 (VMOVSS)
_mm256_cvttpd_epi32 (VCVTTPD2DQ)
_mm256_cvttps_epi32 (VCVTTPS2DQ)
_mm256_cvtsi256_si32
_mm256_cvttps_epi32
load operations
Intrinsics for Load and Store Operations
_mm_broadcast_ss (VBROADCASTSS)
_mm_maskload_pd (VMASKMOVPD)
_mm_maskload_ps (VMASKMOVPS)
_mm_maskstore_pd (VMASKMOVPD)
_mm_maskstore_ps (VMASKMOVPS)
_mm256_maskload_ps (VMASKMOVPS)
_mm256_broadcast_pd (VBROADCASTF128)
_mm256_broadcast_ps (VBROADCASTF128)
_mm256_broadcast_sd (VBROADCASTSD)
_mm256_broadcast_ss (VBROADCASTSS)
_mm256_load_pd (VMOVAPD)
_mm256_load_ps (VMOVAPS)
_mm256_load_si256 (VMOVDQA)
_mm256_loadu_pd (VMOVUPD)
_mm256_loadu_ps (VMOVUPS)
_mm256_loadu_si256 (VMOVDQU)
_mm256_maskload_pd (VMASKMOVPD)
_mm256_maskstore_pd (VMASKMOVPD)
_mm256_maskstore_ps (VMASKMOVPS)
_mm256_store_pd (VMOVAPD)
_mm256_store_ps (VMOVAPS)
_mm256_store_si256 (VMOVDQA)
_mm256_storeu_pd (VMOVUPD)
_mm256_storeu_ps (VMOVUPS)
_mm256_storeu_si256 (VMOVDQU)
_mm256_stream_pd (VMOVNTPD)
_mm256_stream_ps (VMOVNTPS)
minimum and maximum operations
miscellaneous operations
Intrinsics for Miscellaneous Operations
_mm256_extractf128_pd (VEXTRACTF128)
_mm256_extractf128_ps (VEXTRACTF128)
_mm256_extractf128_si256 (VEXTRACTF128)
_mm256_insertf128_pd (VINSERTF128)
_mm256_insertf128_ps (VINSERTF128)
_mm256_insertf128_si256 (VINSERTF128)
_mm256_lddqu_si256 (VLDDQU)
_mm256_movedup_pd (VMOVDDUP)
_mm256_movehdup_ps (VMOVSHDUP)
_mm256_moveldup_ps (VMOVSLDUP)
_mm256_movemask_pd (VMOVMSKPD)
_mm256_movemask_ps (VMOVMSKPS)
_mm256_round_pd (VROUNDPD)
_mm256_round_ps (VROUNDPS)
_mm256_set_epi16
_mm256_set_epi32
_mm256_set_epi64x
_mm256_set_epi8
_mm256_set_pd
_mm256_set_ps
_mm256_set1_epi16
_mm256_set1_epi32
_mm256_set1_epi64x
_mm256_set1_epi8
_mm256_set1_pd
_mm256_set1_ps
_mm256_setr_epi16
_mm256_setr_epi32
_mm256_setr_epi64x
_mm256_setr_epi8
_mm256_setr_pd
_mm256_setr_ps
_mm256_setzero_pd
_mm256_setzero_ps
_mm256_setzero_si256
_mm256_zeroall (VZEROALL)
_mm256_zeroupper (VZEROUPPER)
operations returning vectors of undefined values
_mm256_undefined_pd()
_mm256_undefined_ps()
_mm256_undefined_si256
_mm256_undefined_pd()
_mm256_undefined_ps()
_mm256_undefined_si256
operations to determine maximum value
_mm256_max_pd
_mm256_max_ps
_mm256_max_pd (VMAXPD)
_mm256_max_ps (VMAXPS)
operations to determine minimum value
_mm256_min_pd
_mm256_min_ps
_mm256_min_pd (VMINPD)
_mm256_min_ps (VMINPS)
overview
Details of Intel® Advanced Vector Extensions Intrinsics
Overview: Intrinsics for Intel® Advanced Vector Extensions Instructions
packed test operations
Intrinsics for Packed Test Operations
_mm_testc_pd (VTESTPD)
_mm_testc_ps (VTESTPS)
_mm_testnzc_pd (VTESTPD)
_mm_testnzc_ps (VTESTPS)
_mm_testz_pd (VTESTPD)
_mm_testz_ps (VTESTPS)
_mm256_testc_pd (VTESTPD)
_mm256_testc_ps (VTESTPS)
_mm256_testc_si256 (VPTEST)
_mm256_testnzc_pd (VTESTPD)
_mm256_testnzc_ps (VTESTPS)
_mm256_testnzc_si256 (VPTEST)
_mm256_testz_pd (VTESTPD)
_mm256_testz_ps (VTESTPS)
_mm256_testz_si256 (VPTEST)
permute operations
Intrinsics for Permute Operations
_mm_permute_pd (VPERMILPD)
_mm_permute_ps (VPERMILPS)
_mm_permutevar_pd (VPERMILPD)
_mm_permutevar_ps (VPERMILPS)
_mm256_permute_pd (VPERMILPD)
_mm256_permute_ps (VPERMILPS)
_mm256_permute2f128_pd (VPERM2F128)
_mm256_permute2f128_ps (VPERM2F128)
_mm256_permute2f128_si256 (VPERM2F128)
_mm256_permutevar_pd (VPERMILPD)
_mm256_permutevar_ps (VPERMILPS)
shuffle operations
Intrinsics for Shuffle Operations
_mm256_shuffle_pd (VSHUFPD)
_mm256_shuffle_ps (VSHUFPS)
unpack and interleave operations
Intrinsics for Unpack and Interleave Operations
_mm256_unpackhi_pd (VUNPCKHPD)
_mm256_unpackhi_ps (VUNPCKHPS)
_mm256_unpacklo_pd (VUNPCKLPD)
_mm256_unpacklo_ps (VUNPCKLPS)
vector generation operations
vector typecasting operations
Support Intrinsics for Vector Typecasting Operations
_mm256_castpd_ps
_mm256_castpd_si256
_mm256_castpd128_pd256
_mm256_castpd256_pd128
_mm256_castps_pd
_mm256_castps_si256
_mm256_castps128_ps256
_mm256_castps256_ps128
_mm256_castsi128_si256
_mm256_castsi256_pd
_mm256_castsi256_ps
_mm256_castsi256_si128
Intel® AVX2
Intrinsics for Arithmetic Operations
Intrinsics for Arithmetic Shift Operations
Intrinsics for Operations to Manipulate Integer Data at Bit-Granularity
Intrinsics for Bitwise Operations
Intrinsics for Blend Operations
Intrinsics for Broadcast Operations
Intrinsics for Compare Operations
Intrinsics for Fused Multiply Add Operations
Intrinsics for GATHER Operations
Intrinsics for Insert/Extract Operations
Intrinsics for Masked Load/Store Operations
Intrinsics for Logical Shift Operations
_mm_maskload_epi32/64, _mm256_maskload_epi32/64
_mm_maskstore_epi32/64, _mm256_maskstore_epi32/64
Intrinsics for Miscellaneous Operations
Intrinsics for Pack/Unpack Operations
Intrinsics for Packed Move with Extend Operations
Intrinsics for Permute Operations
Intrinsics for Shuffle Operations
Intrinsics for Intel® Transactional Synchronization Extensions (Intel® TSX)
arithmetic operations
Intrinsics for Arithmetic Operations
_mm256_abs_epi16 (VPABSW)
_mm256_abs_epi32 (VPABSD)
_mm256_abs_epi8 (VPABSB)
_mm256_add_epi16 (VPADDW)
_mm256_add_epi32 (VPADDD)
_mm256_add_epi64 (VPADDQ)
_mm256_add_epi8 (VPADDB)
_mm256_adds_epi16 (VPADDSW)
_mm256_adds_epi8 (VPADDSB)
_mm256_adds_epu16 (VPADDUSW)
_mm256_adds_epu8 (VPADDUSB)
_mm256_avg_epu16 (VPAVGW)
_mm256_avg_epu8 (VPAVGB)
_mm256_hadd_epi16 (VPHADDW)
_mm256_hadd_epi32 (VPHADDD)
_mm256_hadds_epi16 (VPHADDSW)
_mm256_hsub_epi16 (VPHSUBW)
_mm256_hsub_epi32 (VPHSUBD)
_mm256_hsubs_epi16 (VPHSUBSW)
_mm256_madd_epi16 (VPMADDWD)
_mm256_maddubs_epi16 (VPMADDUBSW)
_mm256_mpsadbw_epu8 (VMPSADBW)
_mm256_mul_epi32 (VPMULDQ)
_mm256_mul_epu32 (VPMULUDQ)
_mm256_mulhi_epi16 (VPMULHW)
_mm256_mulhi_epu16 (VPMULHUW)
_mm256_mulhrs_epi16 (VPMULHRSW)
_mm256_mullo_epi16 (VPMULLW)
_mm256_mullo_epi32 (VPMULLD)
_mm256_sad_epu8 (VPSADBW)
_mm256_sign_epi16 (VPSIGNW)
_mm256_sign_epi32 (VPSIGND)
_mm256_sign_epi8 (VPSIGNB)
_mm256_sub_epi16 (VPSUBW)
_mm256_sub_epi32 (VPSUBD)
_mm256_sub_epi64 (VPSUBQ)
_mm256_sub_epi8 (VPSUBB)
_mm256_subs_epi16 (VPSUBSW)
_mm256_subs_epi8 (VPSUBSB)
_mm256_subs_epu16 (VPSUBUSW)
_mm256_subs_epu8 (VPSUBUSB)
arithmetic shift operations
Intrinsics for Arithmetic Shift Operations
_mm_srav_epi32 (VPSRAVD)
_mm256_sra_epi16 (VPSRAW)
_mm256_sra_epi32 (VPSRAD)
_mm256_srai_epi16 (VPSRAW)
_mm256_srai_epi32 (VPSRAD)
_mm256_srav_epi32 (VPSRAVD)
bit manipulation operations
Intrinsics for Operations to Manipulate Integer Data at Bit-Granularity
_bextr_u32 (BEXTR)
_bextr_u64 (BEXTR)
_blsi_u32 (BLSI)
_blsi_u64 (BLSI)
_blsmsk_u32 (BLSMSK)
_blsmsk_u64 (BLSMSK)
_blsr_u64 (BLSR)
_blsr_u32 (BLSR)
_lzcnt_u32 (LZCNT)
_lzcnt_u32/64
_bzhi_u32/64
_lzcnt_u64 (LZCNT)
_lzcnt_u32/64
_bzhi_u32/64
_pdep_u32 (PDEP)
_pdep_u64 (PDEP)
_pext_u32 (PEXT)
_pext_u64 (PEXT)
_tzcnt_u32 (TZCNT)
_tzcnt_u64 (TZCNT)
bitwise logical operations
Intrinsics for Bitwise Operations
_mm256_xor_si256 (VPXOR)
_mm256_and_si256 (VPAND)
_mm256_andnot_si256 (VPANDN)
_mm256_or_si256 (VPOR)
blend operations
Intrinsics for Blend Operations
_mm_blend_epi32
_mm256_blend_epi16 (VPBLENDW)
_mm256_blend_epi32 (VPBLENDD)
_mm256_blendv_epi8 (VPBLENDVB)
broadcast operations
Intrinsics for Broadcast Operations
_mm_broadcastb_epi8 (VPBROADCASTB)
_mm_broadcastd_epi32 (VPBROADCASTD)
_mm_broadcastq_epi64 (VPBROADCASTQ)
_mm_broadcastsd_pd (VBROADCASTSD)
_mm_broadcastss_ps (VBROADCASTSS)
_mm_broadcastw_epi16 (VPBROADCASTW)
_mm256_broadcastb_epi8 (VPBROADCASTB)
_mm256_broadcastd_epi32 (VPBROADCASTD)
_mm256_broadcastq_epi64 (VPBROADCASTQ)
_mm256_broadcastsd_pd (VBROADCASTSD)
_mm256_broadcastsi128_si256 (VBROADCASTI128)
_mm256_broadcastsi128_si256 (VPERM2I128)
_mm256_broadcastss_ps (VBROADCASTSS)
_mm256_broadcastw_epi16 (VPBROADCASTW)
compare operations
Intrinsics for Compare Operations
_mm256_cmpeq_epi16 (VPCMPEQW)
_mm256_cmpeq_epi32 (VPCMPEQD)
_mm256_cmpeq_epi64 (VPCMPEQQ)
_mm256_cmpeq_epi8 (VPCMPEQB)
_mm256_cmpgt_epi16 (VPCMPGTW)
_mm256_cmpgt_epi32 (VPCMPGTD)
_mm256_cmpgt_epi64 (VPCMPGTQ)
_mm256_cmpgt_epi8 (VPCMPGTB)
_mm256_max_epi16 (VPMAXSW)
_mm256_max_epi32 (VPMAXSD)
_mm256_max_epi8 (VPMAXSB)
_mm256_max_epu16 (VPMAXUW)
_mm256_max_epu32 (VPMAXUD)
_mm256_max_epu8 (VPMAXUB)
_mm256_min_epi16 (VPMINSW)
_mm256_min_epi32 (VPMINSD)
_mm256_min_epi8 (VPMINSB)
_mm256_min_epu16 (VPMINUW)
_mm256_min_epu32 (VPMINUD)
_mm256_min_epu8 (VPMINUB)
fused multiply-add (FMA) operations
Intrinsics for Fused Multiply Add Operations
_mm_fmadd_pd (VFMADD###)
_mm_fmadd_ps (VFMADD###)
_mm_fmadd_sd (VFMADD###)
_mm_fmadd_ss (VFMADD###)
_mm_fmaddsub_pd (VFMADDSUB###)
_mm_fmaddsub_ps (VFMADDSUB###)
_mm_fmsub_pd (VFMSUB###)
_mm_fmsub_ps (VFMSUB###)
_mm_fmsub_sd (VFMSUB###)
_mm_fmsub_ss (VFMSUB###)
_mm_fmsubadd_pd (VFMSUBADD###)
_mm_fmsubadd_ps (VFMSUBADD###)
_mm_fnmadd_pd (VFNMADD###)
_mm_fnmadd_ps (VFNMADD###)
_mm_fnmadd_sd (VFNMADD###)
_mm_fnmadd_ss (VFNMADD###)
_mm_fnmsub_pd (VFNMSUB###)
_mm_fnmsub_ps (VFNMSUB###)
_mm_fnmsub_sd (VFNMSUB###)
_mm_fnmsub_ss (VFNMSUB###)
_mm256_fmadd_pd (VFMADD###)
_mm256_fmadd_ps (VFMADD###)
_mm256_fmadd_sd (VFMADD###)
_mm256_fmadd_ss (VFMADD###)
_mm256_fmaddsub_pd (VFMADDSUB###)
_mm256_fmaddsub_ps (VFMADDSUB###)
_mm256_fmsub_pd (VFMSUB###)
_mm256_fmsub_ps (VFMSUB###)
_mm256_fmsub_sd (VFMSUB###)
_mm256_fmsub_ss (VFMSUB###)
_mm256_fmsubadd_pd (VFMSUBADD###)
_mm256_fmsubadd_ps (VFMSUBADD###)
_mm256_fnmadd_pd (VFNMADD###)
_mm256_fnmadd_ps (VFNMADD###)
_mm256_fnmadd_sd (VFNMADD###)
_mm256_fnmadd_ss (VFNMADD###)
_mm256_fnmsub_pd (VFNMSUB###)
_mm256_fnmsub_ps (VFNMSUB###)
_mm256_fnmsub_sd (VFNMSUB###)
_mm256_fnmsub_ss (VFNMSUB###)
GATHER operations
Intrinsics for GATHER Operations
_mm_i32gather_epi32 (VPGATHERDD)
_mm_i32gather_epi64 (VPGATHERDQ)
_mm_i32gather_pd (VGATHERDPD)
_mm_i64gather_epi32 (VPGATHERQD)
_mm_i64gather_epi64 (VPGATHERQQ)
_mm_i64gather_pd (VGATHERQPD)
_mm_i64gather_ps (VGATHERQPS)
_mm_mask_i32gather_epi32 (VPGATHERDD)
_mm_mask_i32gather_epi64 (VPGATHERDQ)
_mm_mask_i32gather_ps (VGATHERDPS)
_mm_mask_i32gather_ps, _mm256_mask_i32gather_ps
_mm_i32gather_ps, _mm256_i32gather_ps
_mm_mask_i64gather_epi32 (VPGATHERQD)
_mm_mask_i64gather_epi64 (VPGATHERQQ)
_mm_mask_i64gather_pd (VGATHERQPD)
_mm_mask_i64gather_ps (VGATHERQPS)
_mm256_i32gather_epi32 (VPGATHERDD)
_mm256_i32gather_epi64 (VPGATHERDQ)
_mm256_i64gather_epi32 (VPGATHERQD)
_mm256_i64gather_epi64 (VPGATHERQQ)
_mm256_i64gather_pd (VGATHERQPD)
_mm256_i64gather_ps (VGATHERQPS)
_mm256_mask_i32gather_epi32 (VPGATHERDD)
_mm256_mask_i32gather_epi64 (VPGATHERDQ)
_mm256_mask_i32gather_pd (VGATHERDPD)
_mm_mask_i32gather_pd, _mm256_mask_i32gather_pd
_mm_i32gather_pd, _mm256_i32gather_pd
_mm256_mask_i32gather_ps (VGATHERDPS)
_mm_mask_i32gather_ps, _mm256_mask_i32gather_ps
_mm_i32gather_ps, _mm256_i32gather_ps
_mm256_mask_i64gather_epi32 (VPGATHERQD)
_mm256_mask_i64gather_epi64 (VPGATHERQQ)
_mm256_mask_i64gather_pd (VGATHERQPD)
_mm256_mask_i64gather_ps (VGATHERQPS)
insert and extract operations
Intrinsics for Insert/Extract Operations
_mm256_extract_epi16
_mm256_extract_epi32
_mm256_extract_epi64
_mm256_extract_epi8
_mm256_extracti128_si256 (VEXTRACTI128)
_mm256_insert_epi16
_mm256_insert_epi32
_mm256_insert_epi64
_mm256_insert_epi8
_mm256_inserti128_si256 (VINSERTI128)
load and store operations
logical shift operations
Intrinsics for Logical Shift Operations
_mm256_srl_epi16 (VPSRLW)
_mm256_srli_epi16 (VPSRLW)
_mm_sllv_epi32 (VPSLLVD)
_mm_sllv_epi64 (VPSLLVQ)
_mm_srlv_epi32 (VPSRLVD)
_mm_srlv_epi64 (VPSRLVQ)
_mm256_sll_epi16 (VPSLLW)
_mm256_sll_epi32 (VPSLLD)
_mm256_sll_epi64 (VPSLLQ)
_mm256_slli_epi16 (VPSLLW)
_mm256_slli_epi32 (VPSLLD)
_mm256_slli_epi64 (VPSLLQ)
_mm256_slli_si256 (VPSLLDQ)
_mm256_sllv_epi32 (VPSLLVD)
_mm256_sllv_epi64 (VPSLLVQ)
_mm256_srl_epi32 (VPSRLD)
_mm256_srl_epi64 (VPSRLQ)
_mm256_srli_epi32 (VPSRLD)
_mm256_srli_epi64 (VPSRLQ)
_mm256_srli_si256 (VPSRLDQ)
_mm256_srlv_epi32 (VPSRLVD)
_mm256_srlv_epi64 (VPSRLVQ)
masked load and store operations
_mm_maskload_epi32/64, _mm256_maskload_epi32/64
_mm_maskstore_epi32/64, _mm256_maskstore_epi32/64
_mm256_maskload_epi32 (VPMASKMOVD)
_mm256_maskload_epi64 (VPMASKMOVQ)
_mm256_maskstore_epi32 (VPMASKMOVD)
_mm256_maskstore_epi64 (VPMASKMOVQ)
miscellaneous operations
Intrinsics for Miscellaneous Operations
_mm256_alignr_epi8 (VPALIGNR)
_mm256_movemask_epi8 (VPMOVMSKB)
_mm256_stream_load_si256 (VMOVNTDQA)
pack and unpack operations
Intrinsics for Pack/Unpack Operations
_mm256_packs_epi16 (VPACKSSWB)
_mm256_packs_epi32 (VPACKSSDW)
_mm256_packus_epi16 (VPACKUSWB)
_mm256_packus_epi32 (VPACKUSDW)
_mm256_unpackhi_epi16 (VPUNPCKHWD)
_mm256_unpackhi_epi32 (VPUNPCKHDQ)
_mm256_unpackhi_epi64 (VPUNPCKHQDQ)
_mm256_unpackhi_epi8 (VPUNPCKHBW)
_mm256_unpacklo_epi16 (VPUNPCKLWD)
_mm256_unpacklo_epi32 (VPUNPCKLDQ)
_mm256_unpacklo_epi64 (VPUNPCKLQDQ)
_mm256_unpacklo_epi8 (VPUNPCKLBW)
packed move operations
Intrinsics for Packed Move with Extend Operations
_mm256_cvtepu8_epi16 (VPMOVZXBW)
_mm256_cvtepi16_epi32 (VPMOVSXWD)
_mm256_cvtepi16_epi64 (VPMOVSXWQ)
_mm256_cvtepi32_epi64 (VPMOVSXDQ)
_mm256_cvtepi8_epi16 (VPMOVSXBW)
_mm256_cvtepi8_epi32 (VPMOVSXBD)
_mm256_cvtepi8_epi64 (VPMOVSXBQ)
_mm256_cvtepu16_epi32 (VPMOVZXWD)
_mm256_cvtepu16_epi64 (VPMOVZXWQ)
_mm256_cvtepu32_epi64 (VPMOVZXDQ)
_mm256_cvtepu8_epi32 (VPMOVZXBD)
_mm256_cvtepu8_epi64 (VPMOVZXBQ)
permute operations
Intrinsics for Permute Operations
_mm256_permute4x64_epi64 (VPERMQ)
_mm256_permute4x64_pd (VPERMPD)
_mm256_permute2x128_si256 (VPERM2I128)
_mm256_permutevar8x32_epi32 (VPERMD)
_mm256_permutevar8x32_ps (VPERMPS)
shuffle operations
Intrinsics for Shuffle Operations
_mm256_shuffle_epi32 (VPSHUFD)
_mm256_shuffle_epi8
_mm256_shuffle_epi8 (VPSHUFB)
_mm256_shufflehi_epi16 (VPSHUFHW)
_mm256_shufflelo_epi16 (VPSHUFLW)
Transactional Synchronization Extensions
Intel® SSE
Arithmetic Intrinsics
Cacheability Support Intrinsics
Compare Intrinsics
Conversion Intrinsics
Details about Intel® Streaming SIMD Extensions Intrinsics
Integer Intrinsics
Load Intrinsics
Logical Intrinsics
Macro Function for Matrix Transposition
Macro Functions to Read and Write Control Registers
Macro Function for Shuffle Operations
Miscellaneous Intrinsics
Overview: Intel® Streaming SIMD Extensions (Intel® SSE)
Writing Programs with Intel® Streaming SIMD Extensions (Intel® SSE) Intrinsics
Intrinsics to Read and Write Registers
Set Intrinsics
Store Intrinsics
arithmetic operations
Arithmetic Intrinsics
add_ps
add_ss
div_ps
div_ss
max_ps
max_ss
min_ps
min_ss
mul_ps
mul_ss
rcp_ps
rcp_ss
rsqrt_ps
rsqrt_ss
sqrt_ps
sqrt_ss
sub_ps
sub_ss
cacheability support operations
Cacheability Support Intrinsics
prefetch
sfence
stream_pi
stream_ps
compare operations
Compare Intrinsics
cmpeq_ps
cmpeq_ss
cmpge_ps
cmpge_ss
cmpgt_ps
cmpgt_ss
cmple_ps
cmple_ss
cmplt_ps
cmplt_ss
cmpneq_ps
cmpneq_ss
cmpnge_ps
cmpnge_ss
cmpngt_ps
cmpngt_ss
cmpnle_ps
cmpnle_ss
cmpnlt_ps
cmpnlt_ss
cmpord_ps
cmpord_ss
cmpunord_ps
cmpunord_ss
comieq_ss
comige_ss
comigt_ss
comile_ss
comilt_ss
comineq_ss
ucomieq_ss
ucomige_ss
ucomigt_ss
ucomile_ss
ucomilt_ss
ucomineq_ss
conversion operations
Conversion Intrinsics
cvtpi16_ps
cvtpi32_ps
cvtpi32x2_ps
cvtpi8_ps
cvtps_pi16
cvtps_pi32
cvtps_pi8
cvtpu16_ps
cvtpu8_ps
cvtsi32_ss
cvtsi64_ss
cvtss_f32
cvtss_si32
cvtss_si64
cvttps_pi32
cvttss_si32
cvttss_si64
data types
integer operations
Integer Intrinsics
avg_pu16
avg_pu8
extract_pi16
insert_pi16
maskmove_si64
max_pi16
max_pu8
min_pi16
min_pu8
movemask_pi8
mulhi_pu16
sad_pu8
shuffle_pi16
load operations
Load Intrinsics
load_ps
load_ps1
load_ss
loadh_pi
loadl_pi
loadr_ps
loadu_ps
logical operations
Logical Intrinsics
and_ps
andnot_ps
or_ps
xor_ps
macros
Macro Function for Matrix Transposition
Macro Functions to Read and Write Control Registers
Macro Function for Shuffle Operations
matrix transposition
read control register
shuffle function
write control register
miscellaneous operations
Miscellaneous Intrinsics
_mm_undefined_ps()
move_ss
movehl_ps
movelh_ps
movemask_ps
shuffle_ps
unpackhi_ps
unpacklo_ps
overview
programming with Intel® SSE intrinsics
read/write register intrinsics
Intrinsics to Read and Write Registers
getcsr
setcsr
registers
set operations
Set Intrinsics
set_ps
set_ps1
set_ss
setr_ps
setzero_ps
store operations
Store Intrinsics
store_ps
store_ps1
store_ss
storeh_pi
storel_pi
storer_ps
storeu_ps
Intel® SSE2
Cacheability Support Intrinsics
Casting Support Intrinsics
Arithmetic Intrinsics
Compare Intrinsics
Conversion Intrinsics
Load Intrinsics
Logical Intrinsics
Set Intrinsics
Store Intrinsics
Arithmetic Intrinsics
Compare Intrinsics
Conversion Intrinsics
Load Intrinsics
Logical Intrinsics
Move Intrinsics
Set Intrinsics
Shift Intrinsics
Store Intrinsics
Miscellaneous Intrinsics
Overview: Intel® Streaming SIMD Extensions 2 (Intel® SSE2)
Pause Intrinsic
Macro Function for Shuffle
cacheability support operations
Cacheability Support Intrinsics
clflush
clflushopt
lfence
mfence
stream_pd
stream_si128
stream_si32
casting support
Casting Support Intrinsics
_mm_castpd_ps
_mm_castpd_si128
_mm_castps_pd
_mm_castps_si128
_mm_castsi128_pd
_mm_castsi128_ps
FP arithmetic operations
Arithmetic Intrinsics
add_pd
add_sd
div_pd
div_sd
max_pd
max_sd
min_pd
min_sd
mul_pd
mul_sd
sqrt_pd
sqrt_sd
sub_pd
sub_sd
FP compare operations
Compare Intrinsics
cmpeq_pd
cmpeq_sd
cmpge_pd
cmpge_sd
cmpgt_pd
cmpgt_sd
cmple_pd
cmple_sd
cmplt_pd
cmplt_sd
cmpneq_pd
cmpneq_sd
cmpnge_pd
cmpnge_sd
cmpngt_pd
cmpngt_sd
cmpnle_pd
cmpnle_sd
cmpnlt_pd
cmpnlt_sd
cmpord_pd
cmpord_sd
cmpunord_pd
cmpunord_sd
comieq_sd
comige_sd
comigt_sd
comile_sd
comilt_sd
comineq_sd
ucomieq_sd
ucomige_sd
ucomigt_sd
ucomile_sd
ucomilt_sd
ucomineq_sd
FP conversion operations
Conversion Intrinsics
cvtepi32_pd
cvtpd_epi32
cvtpd_pi32
cvtpd_ps
cvtpi32_pd
cvtps_pd
cvtsd_f64
cvtsd_si32
cvtsd_ss
cvtsi32_sd
cvtss_sd
cvttpd_epi32
cvttpd_pi32
cvttsd_si32
FP load operations
Load Intrinsics
load_pd
load_sd
load1_pd
loadh_pd
loadl_pd
loadr_pd
loadu_pd
FP logical operations
Logical Intrinsics
and_pd
andnot_pd
or_pd
xor_pd
FP set operations
Set Intrinsics
move_sd
set_pd
set_sd
set1_pd
setr_pd
setzero_pd
FP store operations
Store Intrinsics
store_pd
store_sd
store1_pd
storeh_pd
storel_pd
storer_pd
storeu_pd
integer arithmetic operations
Arithmetic Intrinsics
add_epi16
add_epi32
add_epi64
add_epi8
add_si64
adds_epi16
adds_epi8
adds_epu16
adds_epu8
avg_epu16
avg_epu8
madd_epi16
max_epi16
max_epu8
min_epi16
min_epu8
mul_epu32
mul_su32
mulhi_epi16
mulhi_epu16
mullo_epi16
sad_epu8
sub_epi16
sub_epi32
sub_epi64
sub_epi8
sub_si64
subs_epi16
subs_epi8
subs_epu16
subs_epu8
integer compare operations
Compare Intrinsics
cmpeq_epi16
cmpeq_epi32
cmpeq_epi8
cmpgt_epi16
cmpgt_epi32
cmpgt_epi8
cmplt_epi16
cmplt_epi32
cmplt_epi8
integer conversion operations
Conversion Intrinsics
cvtepi32_ps
cvtps_epi32
cvtsd_si64
cvtsi64_sd
cvttps_epi32
cvttsd_si64
integer load operations
Load Intrinsics
load_si128
loadl_epi64
loadu_si128
integer logical operations
Logical Intrinsics
and_si128
andnot_si128
or_si128
xor_si128
integer move operations
Move Intrinsics
cvtsi128_si32
cvtsi128_si64
cvtsi32_si128
cvtsi64_si128
integer set operations
Set Intrinsics
set_epi16
set_epi32
set_epi64
set_epi8
set1_epi16
set1_epi32
set1_epi64
set1_epi8
setr_epi16
setr_epi32
setr_epi64
setr_epi8
setzero_si128
integer shift operations
Shift Intrinsics
sll_epi16
sll_epi32
sll_epi64
slli_epi16
slli_epi32
slli_epi64
slli_si128
sra_epi16
sra_epi32
srai_epi16
srai_epi32
srl_epi16
srl_epi32
srl_epi64
srli_epi16
srli_epi32
srli_epi64
srli_si128
integer store operations
Store Intrinsics
maskmoveu_si128
store_si128
storel_epi64
storeu_si128
miscellaneous operations
Miscellaneous Intrinsics
extract_epi16
insert_epi16
move_epi64
movemask_epi8
movemask_pd
movepi64_pi64
movpi64_epi64
packs_epi16
packs_epi32
packus_epi16
shuffle_epi32
shuffle_pd
shufflehi_epi16
shufflelo_epi16
unpackhi_epi16
unpackhi_epi32
unpackhi_epi64
unpackhi_epi8
unpackhi_pd
unpacklo_epi16
unpacklo_epi32
unpacklo_epi64
unpacklo_epi8
unpacklo_pd
overview
pause intrinsic
shuffle macro
Intel® SSE3
Single-precision Floating-point Vector Intrinsics
Double-precision Floating-point Vector Intrinsics
Integer Vector Intrinsic
Miscellaneous Intrinsics
Overview: Intel® Streaming SIMD Extensions 3 (Intel® SSE3)
float32 vector intrinsics
Single-precision Floating-point Vector Intrinsics
addsub_ps
hadd_ps
hsub_ps
movehdup_ps
moveldup_ps
float64 vector intrinsics
Double-precision Floating-point Vector Intrinsics
addsub_pd
hadd_pd
hsub_pd
loaddup_pd
movedup_pd
integer vector intrinsic
Integer Vector Intrinsic
lddqu_si128
miscellaneous intrinsics
overview
Intel® SSE4
Application Targeted Accelerators Intrinsics
Cacheability Support Intrinsic
DWORD Multiply Intrinsics
Floating Point Rounding Intrinsics
Floating Point Dot Product Intrinsics
Overview: Intel® Streaming SIMD Extensions 4 (Intel® SSE4)
Packed Blending Intrinsics
Packed Compare for Equal Intrinsic
Packed Compare Intrinsics
Packed DWORD to Unsigned WORD Intrinsic
Packed Format Conversion Intrinsics
Packed Integer Min/Max Intrinsics
Register Insertion/Extraction Intrinsics
Test Intrinsics
application targeted accelerator intrinsics
Application Targeted Accelerators Intrinsics
_mm_crc32_u16
_mm_crc32_u32
_mm_crc32_u64
_mm_crc32_u8
_mm_popcnt_u64
_mm_popcnt_u32
cacheability support intrinsic
Cacheability Support Intrinsic
_mm_stream_load_si128
MOVNTDQA
DWORD multiply operations
DWORD Multiply Intrinsics
_mm_mul_epi32
_mm_mullo_epi32
floating-point rounding operations
Floating Point Rounding Intrinsics
_mm_ceil_pd
_mm_ceil_ps
_mm_ceil_sd
_mm_ceil_ss
_mm_floor_pd
_mm_floor_ps
_mm_floor_sd
_mm_floor_ss
_mm_round_pd
_mm_round_ps
_mm_round_sd
_mm_round_ss
FP dot product operations
Floating Point Dot Product Intrinsics
_mm_dp_pd
_mm_dp_ps
overview
packed blending operations
Packed Blending Intrinsics
_mm_blend_epi16
_mm_blend_pd
_mm_blend_ps
_mm_blendv_epi8
_mm_blendv_pd
_mm_blendv_ps
packed compare for equal intrinsic
_mm_cmpeq_epi64
PCMPEQQ
packed compare operations
Packed Compare Intrinsics
_cmpestra
_cmpestrc
_cmpestri
_cmpestrm
_cmpestro
_cmpestrs
_cmpestrz
_cmpistra
_cmpistrc
_cmpistri
_cmpistrm
_cmpistro
_cmpistrs
_cmpistrz
PCMPESTRA
PCMPESTRC
PCMPESTRI
PCMPESTRM
PCMPESTRO
PCMPESTRS
PCMPESTRZ
PCMPISTRA
PCMPISTRC
PCMPISTRI
PCMPISTRM
PCMPISTRO
PCMPISTRS
PCMPISTRZ
packed DWORD to unsigned WORD intrinsic
Packed DWORD to Unsigned WORD Intrinsic
_mm_packus_epi32
PACKUSDW
packed format conversion operations
Packed Format Conversion Intrinsics
_mm_cvtepi16_epi32
_mm_cvtepi16_epi64
_mm_cvtepi32_epi64
_mm_cvtepi8_epi16
_mm_cvtepi8_epi32
_mm_cvtepi8_epi64
_mm_cvtepu16_epi32
_mm_cvtepu16_epi64
_mm_cvtepu32_epi64
_mm_cvtepu8_epi16
_mm_cvtepu8_epi32
_mm_cvtepu8_epi64
PMOVSXBD
PMOVSXBQ
PMOVSXBW
PMOVSXDQ
PMOVSXWD
PMOVSXWQ
PMOVZXBD
PMOVZXBQ
PMOVZXBW
PMOVZXDQ
PMOVZXWD
PMOVZXWQ
packed integer min/max intrinsics
_mm_max_epi16
_mm_max_epi32
_mm_max_epi8
_mm_max_epu32
_mm_min_epi16
_mm_min_epi32
_mm_min_epi8
_mm_min_epu32
PMAXSB
PMAXSD
PMAXUD
PMAXUW
PMINSB
PMINSD
PMINUW
register insertion/extraction operations
Register Insertion/Extraction Intrinsics
_mm_extract_epi16
_mm_extract_epi32
_mm_extract_epi64
_mm_extract_epi8
_mm_extract_ps
_mm_insert_epi32
_mm_insert_epi64
_mm_insert_epi8
_mm_insert_ps
EXTRACTPS
INSERTPS
PEXTRB
PEXTRD
PEXTRQ
PEXTRW
PINSRB
PINSRD
PINSRQ
test operations
Test Intrinsics
_mm_testc_si128
_mm_testnzc_si128
_mm_testz_si128
Intel® Streaming SIMD Extensions
Arithmetic Intrinsics
Intrinsics to Read and Write Registers
arithmetic operations
register intrinsics
Intel® Streaming SIMD Extensions 3
Single-precision Floating-point Vector Intrinsics
Miscellaneous Intrinsics
float32 vector intrinsics
miscellaneous intrinsics
Later Generation Intel® Core™ Processor Instruction Extensions
memory allocation
Allocating and Freeing Aligned Memory Blocks
Overview: Data Alignment, Memory Allocation Intrinsics, and Inline Assembly
MMX™ Technology
Details about MMX™ Technology Intrinsics
data types
registers
MMX™ Technology
Compare Intrinsics (MMX™ technology)
The EMMS Instruction: Why You Need It
EMMS Usage Guidelines
General Support Intrinsics (MMX™ technology)
Logical Intrinsics (MMX™ technology)
Overview: Intrinsics for MMX™ Technology
Packed Arithmetic Intrinsics (MMX™ technology)
Set Intrinsics (MMX™ technology)
Shift Intrinsics (MMX™ technology)
compare operations
Compare Intrinsics (MMX™ technology)
cmpeq_pi16
cmpeq_pi32
cmpeq_pi8
cmpgt_pi16
cmpgt_pi32
cmpgt_pi8
EMMS instruction
The EMMS Instruction: Why You Need It
EMMS Usage Guidelines
about
using
general support operations
General Support Intrinsics (MMX™ technology)
cvtm64_si64
cvtsi32_si64
cvtsi64_m64
cvtsi64_si32
empty
packs_pi16
packs_pi32
packs_pu16
unpackhi_pi16
unpackhi_pi32
unpackhi_pi8
unpacklo_pi16
unpacklo_pi32
unpacklo_pi8
logical operations
Logical Intrinsics (MMX™ technology)
and_si64
andnot_si64
or_si64
xor_si64
overview
packed arithmetic operations
Packed Arithmetic Intrinsics (MMX™ technology)
add_pi16
add_pi32
add_pi8
adds_pi16
adds_pi8
adds_pu16
adds_pu8
madd_pi16
mulhi_pi16
mullo_pi16
sub_pi16
sub_pi32
sub_pi8
subs_pi16
subs_pi8
subs_pu16
subs_pu8
set operations
Set Intrinsics (MMX™ technology)
set_pi16
set_pi32
set_pi8
set1_pi16
set1_pi32
set1_pi8
setr_pi16
setr_pi32
setr_pi8
setzero_si64
shift operations
Shift Intrinsics (MMX™ technology)
sll_pi16
sll_pi32
slli_pi16
slli_pi32
slli_pi64
sra_pi16
sra_pi32
srai_pi16
srai_pi32
srl_pi16
srl_pi32
srl_pi64
srli_pi16
srli_pi32
srli_pi64
naming and syntax
references
registers
SSSE3
Absolute Value Intrinsics
Addition Intrinsics
Concatenate Intrinsics
Multiplication Intrinsics
Negation Intrinsics
Overview: Supplemental Streaming SIMD Extensions 3 (SSSE3)
Shuffle Intrinsics
Subtraction Intrinsics
absolute value operations
Absolute Value Intrinsics
_mm_abs_epi16
_mm_abs_epi32
_mm_abs_epi8
_mm_abs_pi16
_mm_abs_pi32
_mm_abs_pi8
addition operations
Addition Intrinsics
_mm_hadd_epi16
_mm_hadd_epi32
_mm_hadd_pi16
_mm_hadd_pi32
_mm_hadds_epi16
_mm_hadds_pi16
concatenate operations
Concatenate Intrinsics
_mm_alignr_epi8
_mm_alignr_pi8
multiplication operations
Multiplication Intrinsics
_mm_maddubs_epi16
_mm_maddubs_pi16
_mm_mulhrs_epi16
_mm_mulhrs_pi16
negation operations
overview
shuffle operations
subtraction operations
SVML
_mm_cexp_ps, _mm256_cexp_ps
_mm_clog_ps, _mm256_clog_ps
_mm_csqrt_ps, _mm256_csqrt_ps
_mm_cdfnorminv_pd, _mm256_cdfnorminv_pd
_mm_cdfnorminv_ps, _mm256_cdfnorminv_ps
_mm_erf_pd, _mm256_erf_pd
_mm_erf_ps, _mm256_erf_ps
_mm_erfc_pd, _mm256_erfc_pd
_mm_erfc_ps, _mm256_erfc_ps
_mm_erfinv_pd, _mm256_erfinv_pd
_mm_erfinv_ps, _mm256_erfinv_ps
_mm_exp2_pd, _mm256_exp2_pd
_mm_exp2_ps, _mm256_exp2_ps
_mm_hypot_ps, _mm256_hypot_ps
_mm_exp_pd, _mm256_exp_pd
_mm_exp_ps, _mm256_exp_ps
_mm_exp10_pd, _mm256_exp10_pd
_mm_exp10_ps, _mm256_exp10_ps
_mm_expm1_pd, _mm256_expm1_pd
_mm_expm1_ps, _mm256_expm1_ps
_mm_hypot_pd, _mm256_hypot_pd
_mm_pow_pd, _mm256_pow_pd
_mm_pow_ps, _mm256_pow_ps
_mm_log_pd, _mm256_log_pd
_mm_log_ps, _mm256_log_ps
_mm_log10_pd, _mm256_log10_pd
_mm_log10_ps, _mm256_log10_ps
_mm_log1p_pd, _mm256_log1p_pd
_mm_log1p_ps, _mm256_log1p_ps
_mm_log2_pd, _mm256_log2_pd
_mm_log2_ps, _mm256_log2_ps
_mm_logb_pd, _mm256_logb_pd
_mm_logb_ps, _mm256_logb_ps
Overview: Intrinsics for Short Vector Math Library (SVML) Functions
_mm_sqrt_ps, _mm256_sqrt_ps
_mm_cbrt_pd, _mm256_cbrt_pd
_mm_cbrt_ps, _mm256_cbrt_ps
_mm_invcbrt_pd, _mm256_invcbrt_pd
_mm_invcbrt_ps, _mm256_invcbrt_ps
_mm_invsqrt_pd, _mm256_invsqrt_pd
_mm_invsqrt_ps, _mm256_invsqrt_ps
_mm_sqrt_pd, _mm256_sqrt_pd
_mm_sinh_ps, _mm256_sinh_ps
_mm_acos_pd, _mm256_acos_pd
_mm_acos_ps, _mm256_acos_ps
_mm_acosh_pd, _mm256_acosh_pd
_mm_acosh_ps, _mm256_acosh_ps
_mm_asin_pd, _mm256_asin_pd
_mm_asin_ps, _mm256_asin_ps
_mm_asinh_pd, _mm256_asinh_pd
_mm_asinh_ps, _mm256_asinh_ps
_mm_atan_pd, _mm256_atan_pd
_mm_atan_ps, _mm256_atan_ps
_mm_atan2_pd, _mm256_atan2_pd
_mm_atan2_ps, _mm256_atan2_ps
_mm_atanh_pd, _mm256_atanh_pd
_mm_atanh_ps, _mm256_atanh_ps
_mm_cos_pd, _mm256_cos_pd
_mm_cos_ps, _mm256_cos_ps
_mm_cosd_pd, _mm256_cosd_pd
_mm_cosd_ps, _mm256_cosd_ps
_mm_cosh_pd, _mm256_cosh_pd
_mm_cosh_ps, _mm256_cosh_ps
_mm_sin_pd, _mm256_sin_pd
_mm_sin_ps, _mm256_sin_ps
_mm_sincos_pd, _mm256_sincos_pd
_mm_sincos_ps, _mm256_sincos_ps
_mm_sind_pd, _mm256_sind_pd
_mm_sind_ps, _mm256_sind_ps
_mm_sinh_pd, _mm256_sinh_pd
_mm_tan_pd, _mm256_tan_pd
_mm_tan_ps, _mm256_tan_ps
_mm_tand_pd, _mm256_tand_pd
_mm_tand_ps, _mm256_tand_ps
_mm_tanh_pd, _mm256_tanh_pd
_mm_tanh_ps, _mm256_tanh_ps
complex functions
_mm_cexp_ps, _mm256_cexp_ps
_mm_clog_ps, _mm256_clog_ps
_mm_csqrt_ps, _mm256_csqrt_ps
_mm_cexp_ps, _mm256_cexp_ps
_mm_clog_ps, _mm256_clog_ps
_mm_csqrt_ps, _mm256_csqrt_ps
error functions
_mm_cdfnorminv_pd, _mm256_cdfnorminv_pd
_mm_cdfnorminv_ps, _mm256_cdfnorminv_ps
_mm_erf_pd, _mm256_erf_pd
_mm_erf_ps, _mm256_erf_ps
_mm_erfc_pd, _mm256_erfc_pd
_mm_erfc_ps, _mm256_erfc_ps
_mm_erfinv_pd, _mm256_erfinv_pd
_mm_erfinv_ps, _mm256_erfinv_ps
_mm_cdfnorminv_pd, _mm256_cdfnorminv_pd
_mm_cdfnorminv_ps, _mm256_cdfnorminv_ps
_mm_erf_pd, _mm256_erf_pd
_mm_erf_ps, _mm256_erf_ps
_mm_erfc_pd, _mm256_erfc_pd
_mm_erfc_ps, _mm256_erfc_ps
_mm_erfinv_pd, _mm256_erfinv_pd
_mm_erfinv_ps, _mm256_erfinv_ps
exponential functions
_mm_exp2_pd, _mm256_exp2_pd
_mm_exp2_ps, _mm256_exp2_ps
_mm_hypot_ps, _mm256_hypot_ps
_mm_exp_pd, _mm256_exp_pd
_mm_exp_ps, _mm256_exp_ps
_mm_exp10_pd, _mm256_exp10_pd
_mm_exp10_ps, _mm256_exp10_ps
_mm_expm1_pd, _mm256_expm1_pd
_mm_expm1_ps, _mm256_expm1_ps
_mm_hypot_pd, _mm256_hypot_pd
_mm_pow_pd, _mm256_pow_pd
_mm_pow_ps, _mm256_pow_ps
_mm_exp2_pd, _mm256_exp2_pd
_mm_exp2_ps, _mm256_exp2_ps
_mm_hypot_ps, _mm256_hypot_ps
_mm_exp_pd, _mm256_exp_pd
_mm_exp_ps, _mm256_exp_ps
_mm_exp10_pd, _mm256_exp10_pd
_mm_exp10_ps, _mm256_exp10_ps
_mm_expm1_pd, _mm256_expm1_pd
_mm_expm1_ps, _mm256_expm1_ps
_mm_hypot_pd, _mm256_hypot_pd
_mm_pow_pd, _mm256_pow_pd
_mm_pow_ps, _mm256_pow_ps
logarithmic functions
_mm_log_pd, _mm256_log_pd
_mm_log_ps, _mm256_log_ps
_mm_log10_pd, _mm256_log10_pd
_mm_log10_ps, _mm256_log10_ps
_mm_log1p_pd, _mm256_log1p_pd
_mm_log1p_ps, _mm256_log1p_ps
_mm_log2_pd, _mm256_log2_pd
_mm_log2_ps, _mm256_log2_ps
_mm_logb_pd, _mm256_logb_pd
_mm_logb_ps, _mm256_logb_ps
_mm_log_pd, _mm256_log_pd
_mm_log_ps, _mm256_log_ps
_mm_log10_pd, _mm256_log10_pd
_mm_log10_ps, _mm256_log10_ps
_mm_log1p_pd, _mm256_log1p_pd
_mm_log1p_ps, _mm256_log1p_ps
_mm_log2_pd, _mm256_log2_pd
_mm_log2_ps, _mm256_log2_ps
_mm_logb_pd, _mm256_logb_pd
_mm_logb_ps, _mm256_logb_ps
overview
square and cube root functions
_mm_sqrt_ps, _mm256_sqrt_ps
_mm_cbrt_pd, _mm256_cbrt_pd
_mm_cbrt_ps, _mm256_cbrt_ps
_mm_invcbrt_pd, _mm256_invcbrt_pd
_mm_invcbrt_ps, _mm256_invcbrt_ps
_mm_invsqrt_pd, _mm256_invsqrt_pd
_mm_invsqrt_ps, _mm256_invsqrt_ps
_mm_sqrt_pd, _mm256_sqrt_pd
_mm_sqrt_ps, _mm256_sqrt_ps
_mm_cbrt_pd, _mm256_cbrt_pd
_mm_cbrt_ps, _mm256_cbrt_ps
_mm_invcbrt_pd, _mm256_invcbrt_pd
_mm_invcbrt_ps, _mm256_invcbrt_ps
_mm_invsqrt_pd, _mm256_invsqrt_pd
_mm_invsqrt_ps, _mm256_invsqrt_ps
_mm_sinh_pd, _mm256_sinh_pd
trigonometric functions
_mm_sinh_ps, _mm256_sinh_ps
_mm_acos_pd, _mm256_acos_pd
_mm_acos_ps, _mm256_acos_ps
_mm_acosh_pd, _mm256_acosh_pd
_mm_acosh_ps, _mm256_acosh_ps
_mm_asin_pd, _mm256_asin_pd
_mm_asin_ps, _mm256_asin_ps
_mm_asinh_pd, _mm256_asinh_pd
_mm_asinh_ps, _mm256_asinh_ps
_mm_atan_pd, _mm256_atan_pd
_mm_atan_ps, _mm256_atan_ps
_mm_atan2_pd, _mm256_atan2_pd
_mm_atan2_ps, _mm256_atan2_ps
_mm_atanh_pd, _mm256_atanh_pd
_mm_atanh_ps, _mm256_atanh_ps
_mm_cos_pd, _mm256_cos_pd
_mm_cos_ps, _mm256_cos_ps
_mm_cosd_pd, _mm256_cosd_pd
_mm_cosd_ps, _mm256_cosd_ps
_mm_cosh_pd, _mm256_cosh_pd
_mm_cosh_ps, _mm256_cosh_ps
_mm_sin_pd, _mm256_sin_pd
_mm_sin_ps, _mm256_sin_ps
_mm_sincos_pd, _mm256_sincos_pd
_mm_sincos_ps, _mm256_sincos_ps
_mm_sind_pd, _mm256_sind_pd
_mm_sind_ps, _mm256_sind_ps
_mm_sinh_pd, _mm256_sinh_pd
_mm_tan_pd, _mm256_tan_pd
_mm_tan_ps, _mm256_tan_ps
_mm_tand_pd, _mm256_tand_pd
_mm_tand_ps, _mm256_tand_ps
_mm_tanh_pd, _mm256_tanh_pd
_mm_tanh_ps, _mm256_tanh_ps
_mm_sinh_ps, _mm256_sinh_ps
_mm_acos_pd, _mm256_acos_pd
_mm_acos_ps, _mm256_acos_ps
_mm_acosh_pd, _mm256_acosh_pd
_mm_acosh_ps, _mm256_acosh_ps
_mm_asin_pd, _mm256_asin_pd
_mm_asin_ps, _mm256_asin_ps
_mm_asinh_pd, _mm256_asinh_pd
_mm_asinh_ps, _mm256_asinh_ps
_mm_atan_pd, _mm256_atan_pd
_mm_atan_ps, _mm256_atan_ps
_mm_atan2_pd, _mm256_atan2_pd
_mm_atan2_ps, _mm256_atan2_ps
_mm_atanh_pd, _mm256_atanh_pd
_mm_atanh_ps, _mm256_atanh_ps
_mm_cos_pd, _mm256_cos_pd
_mm_cos_ps, _mm256_cos_ps
_mm_cosd_pd, _mm256_cosd_pd
_mm_cosd_ps, _mm256_cosd_ps
_mm_cosh_pd, _mm256_cosh_pd
_mm_cosh_ps, _mm256_cosh_ps
_mm_sin_pd, _mm256_sin_pd
_mm_sin_ps, _mm256_sin_ps
_mm_sincos_pd, _mm256_sincos_pd
_mm_sincos_ps, _mm256_sincos_ps
_mm_sind_pd, _mm256_sind_pd
_mm_sind_ps, _mm256_sind_ps
_mm_sinh_pd, _mm256_sinh_pd
_mm_tan_pd, _mm256_tan_pd
_mm_tan_ps, _mm256_tan_ps
_mm_tand_pd, _mm256_tand_pd
_mm_tand_ps, _mm256_tand_ps
_mm_tanh_pd, _mm256_tanh_pd
_mm_tanh_ps, _mm256_tanh_ps
intrinsics
Macro Functions
Intel® SSE2
Macro Functions
macro functions
Intel® SSE3
Macro Functions
macro functions
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
Intrinsics for Integer Comparison Operations
Intrinsics for FP Comparison Operations
absolute value operations
Intrinsics for Absolute Value Operations
_mm512[_mask[z]]_abs_epi32
_mm512[_mask[z]]_abs_epi64
architectural enhancements
arithmetic operations
Intrinsics for FP Multiplication Operations
Intrinsics for FP Subtraction Operations
Intrinsics for Integer Addition Operations
Intrinsics for Integer Multiplication Operations
Intrinsics for Integer Subtraction Operations
FP multiplication operations
Intrinsics for FP Multiplication Operations
_mm_mask_mul_round_sd
_mm_mask_mul_round_ss
_mm_maskz_mul_round_sd
_mm_maskz_mul_round_ss
_mm_mul_round_sd
_mm_mul_round_ss
_mm512_maskz_mul_round_pd
_mm512_maskz_mul_round_ps
FP subtraction operations
Intrinsics for FP Subtraction Operations
_mm_mask[z]_sub_sd
_mm_mask[z]_sub_ss
_mm512[_mask[z]]_sub_pd
_mm512[_mask[z]]_sub_ps
_mm512[_mask[z]]_sub_round_pd
_mm512[_mask[z]]_sub_round_ps
_mm512[_mask[z]]_sub_round_sd
_mm512[_mask[z]]_sub_round_ss
integer addition
Intrinsics for Integer Addition Operations
_mm512[_mask[z]]_add_epi32
_mm512[_mask[z]]_add_epi64
integer multiplication operations
Intrinsics for Integer Multiplication Operations
_mm512[_mask[z]]_mul_epi32
_mm512[_mask[z]]_mul_epu32
_mm512[_mask]_mulhi_epi32
_mm512[_mask]_mulhi_epu32
_mm512[_mask]_mullo_epi32
_mm512[_mask]_mullox_epi64
integer subtraction operations
Intrinsics for Integer Subtraction Operations
_mm512[_mask[z]]_sub_round_epi32
_mm512[_mask[z]]_sub_round_epi64
arithmetic operations
Intrinsics for FP Addition Operations
FP addition operations
Intrinsics for FP Addition Operations
_mm_mask[z]_add_sd
_mm_mask[z]_add_ss
_mm[_mask[z]]_add_round_sd
_mm[_mask[z]]_add_round_ss
_mm512[_mask[z]]_add_pd
_mm512[_mask[z]]_add_ps
_mm512[_mask[z]]_add_round_pd
_mm512[_mask[z]]_add_round_ps
bit manipulation operations
Intrinsics for Integer Bit Manipulation and Conflict Detection Operations
_mm512_lzcnt_epi32
_mm512_lzcnt_epi64
_mm512_mask_lzcnt_epi32
_mm512_mask_lzcnt_epi64
_mm512_maskz_lzcnt_epi32
_mm512_maskz_lzcnt_epi64
bit rotation operations
Intrinsics for Integer Bit Rotation Operations
_mm512[_mask[z]]_rol_epi32
_mm512[_mask[z]]_rol_epi64
_mm512[_mask[z]]_rolv_epi32
_mm512[_mask[z]]_rolv_epi64
_mm512[_mask[z]]_ror_epi32
_mm512[_mask[z]]_ror_epi64
_mm512[_mask[z]]_rorv_epi32
_mm512[_mask[z]]_rorv_epi64
bitwise logical operations
Intrinsics for Bitwise Logical Operations
_mm512[_mask[z]]_and_epi32
_mm512[_mask[z]]_and_epi64
_mm512[_mask[z]]_andnot_epi32
_mm512[_mask[z]]_andnot_epi64
_mm512[_mask[z]]_or_epi32
_mm512[_mask[z]]_or_epi64
_mm512[_mask[z]]_xor_epi32
_mm512[_mask[z]]_xor_epi64
blend operations
Intrinsics for Blend Operations
_mm512_mask_blend_epi32
_mm512_mask_blend_epi64
_mm512_mask_blend_pd
_mm512_mask_blend_ps
conflict detection operations
Intrinsics for Test Operations
_mm512[_mask[z]]_conflict_epi32
_mm512[_mask[z]]_conflict_epi64
data types
extract operations
Intrinsics for FP Insert and Extract Operations
Intrinsics for Integer Insert and Extract Operations
_mm512[_mask[z]]_extractf32x4_ps
_mm512[_mask[z]]_extractf64x4_pd
_mm512[_mask[z]]_extracti32x4_epi32
Intrinsics for FP Insert and Extract Operations
Intrinsics for Integer Insert and Extract Operations
_mm512[_mask[z]]_extracti64x4_epi64
Intrinsics for FP Insert and Extract Operations
Intrinsics for Integer Insert and Extract Operations
FP broadcast operations
Intrinsics for FP Broadcast Operations
_mm512[_mask[z]]_broadcast_f32x4
_mm512[_mask[z]]_broadcast_f64x4
_mm512[_mask[z]]_broadcastsd_pd
_mm512[_mask[z]]_broadcastss_ps
FP conversion operations
Intrinsics for FP Conversion Operations
_mm_cvt_roundsd_i32
_mm_cvt_roundsd_i64
_mm_cvt_roundsd_u32
_mm_cvt_roundsd_u64
_mm_cvt_roundss_i32
_mm_cvt_roundss_i64
_mm_cvt_roundss_u32
_mm_cvt_roundss_u64
_mm_cvtt_roundsd_i32
_mm_cvtt_roundsd_i64
_mm_cvtt_roundsd_u32
_mm_cvtt_roundsd_u64
_mm_cvtt_roundss_i32
_mm_cvtt_roundss_i64
_mm_cvtt_roundss_u32
_mm_cvtt_roundss_u64
_mm512[_mask[z]]_cvt_roundpd_epi32
_mm512[_mask[z]]_cvt_roundpd_epu32
_mm512[_mask[z]]_cvt_roundpd_ps
_mm512[_mask[z]]_cvt_roundph_ps
_mm512[_mask[z]]_cvt_roundps_epi32
_mm512[_mask[z]]_cvt_roundps_epu32
_mm512[_mask[z]]_cvt_roundps_pd
_mm512[_mask[z]]_cvt_roundps_ph
_mm512[_mask[z]]_cvt_roundsd_ss
_mm512[_mask[z]]_cvt_roundss_sd
_mm512[_mask[z]]_cvtt_roundpd_epi32
_mm512[_mask[z]]_cvtt_roundpd_epu32
_mm512[_mask[z]]_cvtt_roundps_epi32
_mm512[_mask[z]]_cvtt_roundps_epu32
FP division operations
Intrinsics for FP Division Operations
_mm[_mask[z]]_div_round_sd
_mm[_mask[z]]_div_round_ss
_mm512[_mask[z]]_div_round_pd
FP expand and load operations
Intrinsics for FP Expand and Load Operations
_mm512_mask[z]_expandloadu_pd
_mm512_mask[z]_expandloadu_ps
_mm512[_mask[z]]_expand_pd
_mm512[_mask[z]]_expand_ps
FP Fused Multiply-Add (FMA) operations
Intrinsics for FP Fused Multiply-Add (FMA) Operations
_mm512_mask[3][z]_fmadd_round_sd
_mm512_mask[3][z]_fmadd_round_ss
_mm512_mask[3][z]_fmadd_sd
_mm512_mask[3][z]_fmadd_ss
_mm512_mask[3][z]_fnmadd_round_sd
_mm512_mask[3][z]_fnmadd_round_ss
_mm512_mask[3][z]_fnmadd_sd
_mm512_mask[3][z]_fnmadd_ss
_mm512[_mask[3][z]]_fmadd_pd
_mm512[_mask[3][z]]_fmadd_ps
_mm512[_mask[3][z]]_fmadd_round_pd
_mm512[_mask[3][z]]_fmadd_round_ps
_mm512[_mask[3][z]]_fmaddsub_pd
_mm512[_mask[3][z]]_fmaddsub_ps
_mm512[_mask[3][z]]_fmaddsub_round_pd
_mm512[_mask[3][z]]_fmaddsub_round_ps
_mm512[_mask[3][z]]_fmaddsub_round_sd
_mm512[_mask[3][z]]_fmaddsub_round_ss
_mm512[_mask[3][z]]_fmaddsub_sd
_mm512[_mask[3][z]]_fmaddsub_ss
_mm512[_mask[3][z]]_fmsub_pd
_mm512[_mask[3][z]]_fmsub_ps
_mm512[_mask[3][z]]_fmsub_round_pd
_mm512[_mask[3][z]]_fmsub_round_ps
_mm512[_mask[3][z]]_fmsub_round_sd
_mm512[_mask[3][z]]_fmsub_round_ss
_mm512[_mask[3][z]]_fmsub_sd
_mm512[_mask[3][z]]_fmsub_ss
_mm512[_mask[3][z]]_fnmadd_pd
_mm512[_mask[3][z]]_fnmadd_ps
_mm512[_mask[3][z]]_fnmadd_round_pd
_mm512[_mask[3][z]]_fnmadd_round_ps
_mm512[_mask[3][z]]_fnmaddsub_pd
_mm512[_mask[3][z]]_fnmaddsub_ps
_mm512[_mask[3][z]]_fnmaddsub_round_pd
_mm512[_mask[3][z]]_fnmaddsub_round_ps
_mm512[_mask[3][z]]_fnmaddsub_round_sd
_mm512[_mask[3][z]]_fnmaddsub_round_ss
_mm512[_mask[3][z]]_fnmaddsub_sd
_mm512[_mask[3][z]]_fnmaddsub_ss
_mm512[_mask[3][z]]_fnmsub_pd
_mm512[_mask[3][z]]_fnmsub_ps
_mm512[_mask[3][z]]_fnmsub_round_pd
_mm512[_mask[3][z]]_fnmsub_round_ps
_mm512[_mask[3][z]]_fnmsub_round_sd
_mm512[_mask[3][z]]_fnmsub_round_ss
_mm512[_mask[3][z]]_fnmsub_sd
_mm512[_mask[3][z]]_fnmsub_ss
FP gather and scatter operations
FP load and store operations
FP move operations
Intrinsics for FP Move Operations
_mm512_mask_mov_pd
_mm512_mask_mov_ps
_mm512_mask_move_sd
_mm512_mask_move_ss
_mm512_mask_movedup_pd
_mm512_mask_movehdup_ps
_mm512_mask_moveldup_ps
_mm512_maskz_mov_pd
_mm512_maskz_mov_ps
_mm512_maskz_move_sd
_mm512_maskz_move_ss
_mm512_maskz_movedup_pd
_mm512_maskz_movehdup_ps
_mm512_maskz_moveldup_ps
_mm512_movedup_pd
_mm512_movehdup_ps
_mm512_moveldup_ps
FP permute operations
Intrinsics for FP Permutation Operations
_mm512[_mask[2][z]]_permutex2var_pd
_mm512[_mask[2][z]]_permutex2var_ps
_mm512[_mask[z]]_permute_pd
_mm512[_mask[z]]_permute_ps
_mm512[_mask[z]]_permutevar_pd
_mm512[_mask[z]]_permutevar_ps
_mm512[_mask[z]]_permutex_pd
_mm512[_mask[z]]_permutexvar_pd
_mm512[_mask[z]]_permutexvar_ps
_mm512[_mask]_permute4f128_ps
FP reduction operations
Intrinsics for FP Reduction Operations
_mm512[_mask]_reduce_add_pd
_mm512[_mask]_reduce_add_ps
_mm512[_mask]_reduce_max_pd
_mm512[_mask]_reduce_max_ps
_mm512[_mask]_reduce_min_pd
_mm512[_mask]_reduce_min_ps
_mm512[_mask]_reduce_mul_pd
_mm512[_mask]_reduce_mul_ps
FP shuffle operations
Intrinsics for FP Shuffle Operations
_mm512[_mask[z]]_shuffle_f32x4
_mm512[_mask[z]]_shuffle_f64x2
_mm512[_mask[z]]_shuffle_pd
_mm512[_mask[z]]_shuffle_ps
FP unpack operations
Intrinsics for FP Pack and Unpack Operations
_mm512_mask_unpackhi_pd
_mm512_mask_unpackhi_ps
_mm512_mask_unpacklo_pd
_mm512_mask_unpacklo_ps
_mm512_maskz_unpackhi_pd
_mm512_maskz_unpackhi_ps
_mm512_maskz_unpacklo_pd
_mm512_maskz_unpacklo_ps
_mm512_unpackhi_pd
_mm512_unpackhi_ps
_mm512_unpacklo_pd
_mm512_unpacklo_ps
insert operations
Intrinsics for FP Insert and Extract Operations
Intrinsics for Integer Insert and Extract Operations
_mm_extract_ps
_mm_insert_ps
_mm256_insertf128_pd
_mm256_insertf128_ps
_mm256_insertf128_si256
Intrinsics for FP Insert and Extract Operations
Intrinsics for Integer Insert and Extract Operations
_mm512[_mask[z]]_insertf32x4
_mm512[_mask[z]]_insertf64x4
Intrinsics for FP Insert and Extract Operations
Intrinsics for Integer Insert and Extract Operations
_mm512[_mask[z]]_inserti64x4
integer bit shift operations
Intrinsics for Integer Bit Shift Operations
_mm512[_mask[z]]_sll_epi32
_mm512[_mask[z]]_sll_epi64
_mm512[_mask[z]]_slli_epi64
_mm512[_mask[z]]_sllv_epi64
_mm512[_mask[z]]_sra_epi32
_mm512[_mask[z]]_sra_epi64
_mm512[_mask[z]]_srai_epi64
_mm512[_mask[z]]_srav_epi64
_mm512[_mask[z]]_srl_epi32
_mm512[_mask[z]]_srl_epi64
_mm512[_mask[z]]_srli_epi32
_mm512[_mask[z]]_srli_epi64
_mm512[_mask[z]]_srlv_epi32
_mm512[_mask[z]]_srlv_epi64
integer broadcast operations
Intrinsics for Integer Broadcast Operations
_mm512_broadcastmb_epi64
_mm512_broadcastmw_epi32
_mm512[_mask[z]]_broadcast_i32x4
_mm512[_mask[z]]_broadcast_i64x4
_mm512[_mask[z]]_broadcastd_epi32
_mm512[_mask[z]]_broadcastq_epi64
integer compression operations
integer conversion operations
Intrinsics for Integer Conversion Operations
_mm_cvt_roundi32_ss
_mm_cvt_roundi64_sd
_mm_cvt_roundi64_ss
_mm_cvt_roundu32_ss
_mm_cvt_roundu64_sd
_mm_cvt_roundu64_ss
_mm_cvtu32_sd
_mm512_cvtsi512_si32
_mm512[_mask[z]]_cvt_roundepi32_ps
_mm512[_mask[z]]_cvt_roundepu32_ps
_mm512[_mask[z]]_cvtepi16_epi32
_mm512[_mask[z]]_cvtepi16_epi64
_mm512[_mask[z]]_cvtepi32_epi16
_mm512[_mask[z]]_cvtepi32_epi64
_mm512[_mask[z]]_cvtepi32_epi8
_mm512[_mask[z]]_cvtepi32_pd
_mm512[_mask[z]]_cvtepi64_epi16
_mm512[_mask[z]]_cvtepi64_epi32
_mm512[_mask[z]]_cvtepi64_epi8
_mm512[_mask[z]]_cvtepi8_epi32
_mm512[_mask[z]]_cvtepi8_epi64
_mm512[_mask[z]]_cvtepu16_epi32
_mm512[_mask[z]]_cvtepu32_epi64
_mm512[_mask[z]]_cvtepu32_pd
_mm512[_mask[z]]_cvtepu8_epi64
_mm512[_mask[z]]_cvtsepi32_epi16
_mm512[_mask[z]]_cvtsepi32_epi8
_mm512[_mask[z]]_cvtsepi64_epi16
_mm512[_mask[z]]_cvtsepi64_epi32
_mm512[_mask[z]]_cvtsepi64_epi8
_mm512[_mask[z]]_cvtusepi32_epi16
_mm512[_mask[z]]_cvtusepi32_epi8
_mm512[_mask[z]]_cvtusepi64_epi16
_mm512[_mask[z]]_cvtusepi64_epi32
_mm512[_mask[z]]_cvtusepi64_epi8
integer gather and scatter operations
integer move operations
Intrinsics for Integer Move Operations
_mm512_mask[z]_mov_epi32
_mm512_mask[z]_mov_epi64
integer permute operations
Intrinsics for Integer Permutation Operations
_mm512[_mask[2][z]]_permutex2var_epi32
_mm512[_mask[2][z]]_permutex2var_epi64
_mm512[_mask[z]]_permutex_epi64
_mm512[_mask[z]]_permutexvar_epi32
integer reduction operations
Intrinsics for Integer Reduction Operations
_mm512[_mask]_reduce_add_epi64
_mm512[_mask]_reduce_and_epi64
_mm512[_mask]_reduce_max_epi64
_mm512[_mask]_reduce_max_epu64
_mm512[_mask]_reduce_min_epi64
_mm512[_mask]_reduce_min_epu64
_mm512[_mask]_reduce_mul_epi64
_mm512[_mask]_reduce_or_epi64
integer shuffle operations
Intrinsics for Integer Shuffle Operations
_mm512[_mask[z]]_shuffle_epi32
_mm512[_mask[z]]_shuffle_f32x4
_mm512[_mask[z]]_shuffle_f64x2
_mm512[_mask[z]]_shuffle_i32x4
_mm512[_mask[z]]_shuffle_i64x2
_mm512[_mask[z]]_shuffle_pd
_mm512[_mask[z]]_shuffle_ps
load and store operations
mathematics operations
minimum and maximum FP operations
Intrinsics for Determining Minimum and Maximum FP Values
_mm[_mask[z]]_max_round_sd
_mm[_mask[z]]_max_round_ss
_mm[_mask[z]]_min_round_sd
_mm[_mask[z]]_min_round_ss
_mm512[_mask[z]]_max_round_pd
_mm512[_mask[z]]_max_round_ps
_mm512[_mask[z]]_min_round_pd
_mm512[_mask[z]]_min_round_ps
minimum and maximum integer operations
Intrinsics for Determining Minimum and Maximum Integer Values
_mm512[_mask[z]]_max_epi32
_mm512[_mask[z]]_max_epi64
_mm512[_mask[z]]_max_epu32
_mm512[_mask[z]]_max_epu64
_mm512[_mask[z]]_min_epi32
_mm512[_mask[z]]_min_epi64
_mm512[_mask[z]]_min_epu32
_mm512[_mask[z]]_min_epu64
miscellaneous FP operations
miscellaneous integer operations
Intrinsics for Miscellaneous Integer Operations
_mm512[_mask[z]]_alignr_epi32
_mm512[_mask[z]]_alignr_epi64
overview
registers
scale operations
Intrinsics for Scale Operations
_mm512_mask_scalef_round_pd
_mm512_mask_scalef_round_ps
_mm512_mask_scalef_round_sd
_mm512_mask_scalef_round_ss
_mm512_maskz_scalef_round_pd
_mm512_maskz_scalef_round_ps
_mm512_maskz_scalef_round_sd
_mm512_maskz_scalef_round_ss
_mm512_scalef_round_pd
_mm512_scalef_round_ps
_mm512_scalef_round_sd
_mm512_scalef_round_ss
set operations
Intrinsics for Set Operations
_mm512_undefined
_mm512_undefined_epi32
_mm512_undefined_pd
_mm512_undefined_ps
SVML
Intrinsics for Short Vector Math Library (SVML) Operations
Intrinsics for Error Function Operations (512-bit)
Intrinsics for Logarithmic Operations (512-bit)
Intrinsics for Root Function Operations (512-bit)
division operations
Intrinsics for Short Vector Math Library (SVML) Operations
_mm[_mask[z]]_div_round_sd
_mm[_mask[z]]_div_round_ss
_mm512[_mask[z]]_div_round_pd
error function operations
Intrinsics for Short Vector Math Library (SVML) Operations
Intrinsics for Error Function Operations (512-bit)
exponential operations
logarithmic operations
logarithmic operations
Intrinsics for Logarithmic Operations (512-bit)
_mm512[_mask]_log_pd
_mm512[_mask]_log_ps
_mm512[_mask]_log10_pd
_mm512[_mask]_log10_ps
_mm512[_mask]_log1p_pd
_mm512[_mask]_log1p_ps
_mm512[_mask]_log2_pd
_mm512[_mask]_log2_ps
_mm512[_mask]_logb_pd
_mm512[_mask]_logb_ps
reciprocal operations
remainder operations
root and cube root operations
root operations
Intrinsics for Root Function Operations (512-bit)
_mm512[_mask]_cbrt_pd
_mm512[_mask]_cbrt_ps
_mm512[_mask]_hypot_pd
_mm512[_mask]_hypot_ps
_mm512[_mask]_invsqrt_pd
_mm512[_mask]_invsqrt_ps
_mm512[_mask]_sqrt_pd
_mm512[_mask]_sqrt_ps
rounding operations
Intrinsics for Short Vector Math Library (SVML) Operations
_mm512[_mask]__pd
_mm512[_mask]__ps
trigonometric operations
Intrinsics for Short Vector Math Library (SVML) Operations
_mm512[_mask]_acos_pd
_mm512[_mask]_acos_ps
_mm512[_mask]_acosh_pd
_mm512[_mask]_acosh_ps
_mm512[_mask]_asin_pd
_mm512[_mask]_asin_ps
_mm512[_mask]_asinh_pd
_mm512[_mask]_asinh_ps
_mm512[_mask]_atan_pd
_mm512[_mask]_atan_ps
_mm512[_mask]_atan2_pd
_mm512[_mask]_atan2_ps
_mm512[_mask]_atanh_pd
_mm512[_mask]_atanh_ps
_mm512[_mask]_cos_pd
_mm512[_mask]_cos_ps
_mm512[_mask]_cosd_pd
_mm512[_mask]_cosd_ps
_mm512[_mask]_cosh_pd
_mm512[_mask]_cosh_ps
_mm512[_mask]_sin_pd
_mm512[_mask]_sin_ps
_mm512[_mask]_sind_pd
_mm512[_mask]_sind_ps
_mm512[_mask]_sinh_pd
_mm512[_mask]_sinh_ps
_mm512[_mask]_tan_pd
_mm512[_mask]_tan_ps
_mm512[_mask]_tand_pd
_mm512[_mask]_tand_ps
_mm512[_mask]_tanh_pd
_mm512[_mask]_tanh_ps
SVML operations
Intrinsics for Exponential Operations (512-bit)
Intrinsics for Reciprocal Operations (512-bit)
exponential operations
Intrinsics for Exponential Operations (512-bit)
_mm512[_mask[z]]_exp2a23_pd
_mm512[_mask[z]]_exp2a23_ps
_mm512[_mask[z]]_exp2a23_round_pd
_mm512[_mask[z]]_exp2a23_round_ps
_mm512[_mask]_exp_pd
_mm512[_mask]_exp_ps
_mm512[_mask]_exp10_pd
_mm512[_mask]_exp10_ps
_mm512[_mask]_exp2_pd
_mm512[_mask]_exp2_ps
_mm512[_mask]_expm1_pd
_mm512[_mask]_expm1_ps
_mm512[_mask]_pow_ps
_mm512[_mask]_pow_pd
reciprocal operations
test operations
Intrinsics for Test Operations
_mm512_mask_test_epi64_mask
_mm512_mask_testn_epi32_mask
_mm512_mask_testn_epi64_mask
_mm512_test_epi64_mask
_mm512_testn_epi32_mask
_mm512_testn_epi64_mask
typecast operations
Intrinsics for Typecast Operations
_mm512_castpd_ps
_mm512_castpd_si512
_mm512_castpd128_pd512
_mm512_castpd256_pd512
_mm512_castpd512_pd128
_mm512_castpd512_pd256
_mm512_castpd512_ps128
_mm512_castps_pd
_mm512_castps_si512
_mm512_castps128_ps512
_mm512_castps256_ps512
_mm512_castps512_ps128
_mm512_castps512_ps256
_mm512_castsi128_si512
_mm512_castsi256_si512
_mm512_castsi512_pd
_mm512_castsi512_ps
_mm512_castsi512_si128
_mm512_castsi512_si256
unpack operations
Intrinsics for Integer Pack and Unpack Operations
_mm512_[mask[z]_]unpackhi_epi32
_mm512_[mask[z]_]unpackhi_epi64
_mm512_[mask[z]_]unpackhi_pd
_mm512_[mask[z]_]unpackhi_ps
_mm512_[mask[z]_]unpacklo_epi32
_mm512_[mask[z]_]unpacklo_epi64
_mm512_[mask[z]_]unpacklo_pd
_mm512_[mask[z]_]unpacklo_ps
vector mask operations
Intrinsics for Vector Mask Operations
_mm512_kand
_mm512_kandn
_mm512_kmov
_mm512_knot
_mm512_kor
_mm512_kortestc
_mm512_kortestz
_mm512_kunpackb
_mm512_kxnor
_mm512_kxor
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512)
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) BF16 Instructions
Intrinsics for Integer Expand and Load Operations
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) VPOPCNTDQ Instructions
BF16 Instructions
integer expand and load operations
Intrinsics for Integer Expand and Load Operations
_mm512_mask[z]_expand_epi32
_mm512_mask[z]_expand_epi64
_mm512_mask[z]_expandloadu_epi32
_mm512_mask[z]_expandloadu_epi64
VPOPCNTDQ Instructions
Intrinsics for Intel® Advanced Vector Extensions 512 (Intel® AVX-512) BW, DQ, and VL instructions
Intrinsics for Bit Manipulation Operations
Intrinsics for Comparison Operations
Intrinsics for Conversion Operations
Intrinsics for Load Operations
Intrinsics for Logical Operations
Intrinsics for Miscellaneous Operations
Intrinsics for Move Operations
Intrinsics for Set Operations
Intrinsics for Shift Operations
Intrinsics for Store Operations
bit manipulation operations
comparison operations
conversion operations
load operations
logical operations
miscellaneous operations
move operations
set operations
shift operations
store operations
invoking Intel® C++ Compiler
IR
ivdep
IVDEP
High-Level Optimization (HLO)
effect when tuning applications
language extensions
GCC* Compatibility and Interoperability
g++*
gcc*
LD_LIBRARY_PATH
Level Zero (L0)
LIB environment variable
libgcc library
shared-libgcc
static-libgcc
option linking dynamically
option linking statically
libistrconv Library
Function List
Intel's Numeric String Conversion Library
Overview: Intel's Numeric String Conversion Library
Intel's Numeric String Conversion functions
Numeric String Conversion
Numeric String Conversion Functions
libm
libqkmalloc Library
libraries
creating
option letting you link to Intel® DAAL
option preventing linking with shared
option preventing use of standard
option printing location of system
redistributing
specifying
libraries
Creating Libraries
Managing Libraries
Using Intel Shared Libraries
-c compiler option
-fPIC compiler option
-shared compiler option
creating your own
LD_LIBRARY_PATH
managing
shared
Using Intel Shared Libraries
Creating Libraries
static
library
L
l
option searching in specified directory for
option to search for
Library extensions
C++ Library Extensions
valarray implementation
library functions
library math functions
fmath-errno
option testing errno after calls to
libstdc++ library
static-libstdc++
option linking statically
linear_index
linker
Xlinker
link
option passing linker option to
option passing options to
linker options
Passing Options to the Linker
specifying
linking
Linking Debug Information
Compilation Phases
option preventing use of startup files and libraries when
option preventing use of startup files when
option suppressing
linking debug information
linking tools
IPO-Related Performance Issues
Creating a Library from IPO Objects
Interprocedural Optimization (IPO)
xild
IPO-Related Performance Issues
Creating a Library from IPO Objects
Interprocedural Optimization (IPO)
xilibtool
xilink
IPO-Related Performance Issues
Interprocedural Optimization (IPO)
linking tools IR
linking with IPO
Linux* compiler options
Specifying Object Files
c
o
Linux* compiler options
Specifying Include Files
Specifying Assembly Files
I
S
X
loop unrolling
Programming Guidelines for Vectorization
using the HLO optimizer
loop_count
loops
Loop Constructs
constructs
dependencies
distribution
interchange
option specifying maximum times to unroll
parallelization
Vectorization and Loops
Programming with Auto-parallelization
transformations
vectorization
Vectorization and Loops
Vectorizing a Loop Using the _Simd Keyword
macro names
D
option associating with an optional value
macros
Additional Predefined Macros
GCC* Compatibility and Interoperability
Equivalent Macros
ISO Standard Predefined Macros
maintainability
makefiles
Modifying Your makefile
Modifying Your makefile
modifying
Modifying Your makefile
Modifying Your makefile
makefiles, using
managed and unmanaged code
Math Library
Overview: Intel® Math Library
code examples
function list
Complex Functions
Exponential Functions
Hyperbolic Functions
Miscellaneous Functions
Nearest Integer Functions
Remainder Functions
Special Functions
Trigonometric Functions
using
Math library
Complex Functions
Exponential Functions
Hyperbolic Functions
Miscellaneous Functions
Nearest Integer Functions
Remainder Functions
Special Functions
Trigonometric Functions
Complex Functions
cabs library function
cacos library function
cacosh library function
carg library function
casin library function
casinh library function
catan library function
catanh library function
ccos library function
ccosh library function
cexp library function
cexp10 library function
cimag library function
cis library function
clog library function
clog2 library function
conj library function
cpow library function
cproj library function
creal library function
csin library function
csinh library function
csqrt library function
ctan library function
ctanh library function
Exponential Functions
cbrt library function
exp library function
exp10 library function
exp2 library function
expm1 library function
frexp library function
hypot library function
ilogb library function
ldexp library function
log library function
log10 library function
log1p library function
log2 library function
logb library function
pow library function
scalb library function
scalbn library function
sqrt library function
Hyperbolic Functions
acosh library function
asinh library function
atanh library function
cosh library function
sinh library function
sinhcosh library function
tanh library function
Miscellaneous Functions
copysign library function
fabs library function
fdim library function
finite library function
fma library function
fmax library function
fmin library function
Miscellaneous Functions
nextafter library function
Nearest Integer Functions
ceil library function
floor library function
llrint library function
llround library function
lrint library function
lround library function
modf library function
nearbyint library function
rint library function
round library function
trunc library function
Remainder Functions
fmod library function
remainder library function
remquo library function
Special Functions
annuity library function
compound library function
erf library function
erfc library function
gamma library function
gamma_r library function
j0 library function
j1 library function
jn library function
lgamma library function
lgamma_r library function
tgamma library function
y0 library function
y1 library function
yn library function
Trigonometric Functions
acos library function
acosd library function
asin library function
asind library function
atan library function
atan2 library function
atand library function
atand2 library function
cos library function
cosd library function
cot library function
cotd library function
sin library function
sincos library function
sincosd library function
sind library function
tan library function
tand library function
memory model
mcmodel
option specifying large
option specifying small or medium
option to use specific
Message Passing Interface (MPI) support
Microsoft Visual Studio*
Creating a New Project
Intel® Performance Libraries
property pages
Microsoft* Visual Studio*
Microsoft Compatibility
compatibility
integration
min_val
mixing vectorizable types in a loop
mock object files
MPI support
mpx
attribute
multithreaded programs
multithreading
MXCSR register
noblock_loop
nofusion
noinline
noparallel
noprefetch
normalized floating-point number
Not-a-Number (NaN)
nounroll
nounroll_and_jam
novector
object files
Specifying Object Files
specifying
omp simd early exit
omp simdoff
Open Source tools
optimization
Other Considerations
Other Considerations
option specifying code
optimization_level
optimization_parameter
optimizations
High-Level Optimization (HLO)
Od
Ot
Os
high-level language
option disabling all
option enabling all speed
option enabling many speed
optimize
output files
o
option specifying name for
overview
parallel
parallel pragma
Enabling Further Loop Parallelization for Multicore Platforms
lastprivate clause
private clause
parallelism
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
Automatic Parallelization
parallelization
Automatic Parallelization
Programming with Auto-parallelization
performance
performance issues with IPO
porting applications
Overview: Porting from Microsoft* Visual C++* to the Intel® C++ Compiler
from the Microsoft* C++ Compiler
to the Intel® C++ Compiler
porting applications
Overview: Porting from gcc* to the Intel® C++ Compiler
from gcc* to the Intel® C++ Compiler
position-independent code
fpic
fpie
option generating
fpic
fpie
pragma alloc_section
alloc_section
var
pragma block_loop
block_loop/noblock_loop
factor
level
pragma code_align
pragma distribute_point
pragma forceinline
inline, noinline, forceinline
recursive
pragma inline
inline, noinline, forceinline
recursive
pragma intel_omp_task
pragma intel_omp_taskq
pragma ivdep
pragma loop_count
loop_count
avg
max
min
n
pragma noblock_loop
pragma nofusion
pragma noinline
pragma noparallel
pragma noprefetch
prefetch/noprefetch
var
pragma nounroll
pragma nounroll_and_jam
pragma novector
pragma omp simdoff
pragma optimization_level
optimization_level
GCC
intel
n
pragma optimization_parameter
optimization_parameter
target_arch
pragma optimize
optimize
off
on
pragma parallel
parallel/noparallel
always
firstprivate
lastprivate
num_threads
private
pragma prefetch
prefetch/noprefetch
distance
hint
var
pragma simd
User-Mandated or SIMD Vectorization
assert
firstprivate
lastprivate
linear
noassert
novecremainder
private
reduction
vecremainder
vectorlength
vectorlengthfor
pragma unroll
pragma unroll_and_jam
pragma unused
pragma vector
vector
aligned
always
mask_readwrite
nomask_readwrite
nontemporal
novecremainder
temporal
unaligned
vecremainder
Pragmas
Intel-supported Pragma Reference
Pragmas
gcc* compatible
HP* compatible
Intel-supported
Microsoft* compatible
overview
Pragmas: Intel-specific
precompiled header files
predefined macros
Additional Predefined Macros
GCC* Compatibility and Interoperability
ISO Standard Predefined Macros
preempting functions
prefetch
program loops
projects
Creating a New Project
adding files
creating
in Microsoft Visual Studio*
property pages in Microsoft Visual Studio*
Proxy
Proxy
ConstProxy
queue order properties
redistributable package
redistributing libraries
release configuration
remarks
Werror-all
option changing to errors
response files
restricted transactional memory
Intrinsics for Restricted Transactional Memory Operations
Intrinsics for Intel® Transactional Synchronization Extensions (Intel® TSX)
RTM
Function Prototype and Macro Definitions
function prototypes
macro definitions
run-time environment variables
run-time performance
Overview: Tuning Performance
improving
SDLT
Example 4
Example 3
Example 2
accessors
Bounds
Accessors
example programs
Example 1
Examples
Example 5
indexes
number representation
proxy objects
SDLT_DEBUG
SDLT_INLINE
SDLT Layouts
Layouts
sdlt layout namespace
setting options
Setting Options for a Project or File
in Eclipse*
setvars.bat
setvars.csh
setvars.sh
shared libraries
shared object
shared
option producing a dynamic
Short Vector Math Library (SVML) Intrinsics
Overview: Intrinsics for Short Vector Math Library (SVML) Functions
overview
Short Vector Random Number Generator Library
signed infinity
signed zero
simd
vectorization
simd
function annotations
SIMD-enabled functions
SIMD-Enabled Functions
pointers to
SMP systems
soa1d_container
soa1d_container::accessor
Accessor Concept
n_bounds_generator
n_bounds_t
soa1d_container::accessor and aos1d_container::accessor
bounds_t
bounds_d Template Function
sdlt::bounds Template Function
soa1d_container::const_accessor
specifying file names
Specifying Object Files
for object files
specifying file names
Specifying Assembly Files
for assembly files
stack checking routine
Gs
option controlling threshold for call of
stack variables
ftrapuv
option initializing to NaN
standard directories
X
option removing from include search path
standards conformance
static libraries
sub-groups for NDRange parallelism
subnormal numbers
Supplemental Streaming SIMD Extensions 3
Absolute Value Intrinsics
Addition Intrinsics
Concatenate Intrinsics
Multiplication Intrinsics
Negation Intrinsics
Overview: Supplemental Streaming SIMD Extensions 3 (SSSE3)
Shuffle Intrinsics
Subtraction Intrinsics
absolute value intrinsics
addition intrinsics
concatenate intrinsics
multiplication intrinsics
negation intrinsics
Negation Intrinsics
_mm_sign_epi16
_mm_sign_epi32
_mm_sign_epi8
_mm_sign_pi16
_mm_sign_pi32
_mm_sign_pi8
overview
shuffle intrinsics
_mm_shuffle_epi8
_mm_shuffle_pi8
subtraction intrinsics
Subtraction Intrinsics
_mm_hsub_epi16
_mm_hsub_epi32
_mm_hsub_pi16
_mm_hsub_pi32
_mm_hsubs_epi16
_mm_hsubs_pi16
supported tools
SVML
SYCL_INTEL_unnamed_kernel_lambda
synchronization
thread pooling
threads
Using Intel Libraries for oneAPI with Microsoft Visual Studio*
Changing the Selected Intel Libraries for oneAPI
threshold control for auto-parallelization
Programming Guidelines for Vectorization
reordering
to Microsoft Visual Studio* projects
unified shared memory
unroll
unroll/nounroll
n
unroll_and_jam
unroll_and_jam/nounroll_and_jam
n
unused
unwind information
fasynchronous-unwind-tables
option determining where precision occurs
user functions
Developer Directed Inline Expansion of User Functions
Compiler Directed Inline Expansion of Functions
auto-parallelization
using
Using Configuration Files
Using Response Files
using Intel® Performance Libraries
Using Intel Libraries for oneAPI with Eclipse*
in Eclipse*
using property pages in Microsoft Visual Studio*
valarray implementation
Using Intel's valarray Implementation
compiling code
using in code
variables
fzero-initialized-in-bss, Qzero-initialized-in-bss
fkeep-static-consts
option placing explicitly zero-initialized in DATA section
option saving always
vector
vector copy
Programming Guidelines for Vectorization
non-vectorizable copy
programming guidelines
vectorization
Using Automatic Vectorization
compiler options
compiler pragmas
keywords
obstacles
speed-up
what is
Vectorization
Programming Guidelines for Vectorization
Function Annotations and the SIMD Directive for Vectorization
simd
User-Mandated or SIMD Vectorization
auto-parallelization
Programming Guidelines for Vectorization
reordering threshold control
general compiler directives
Intel® Streaming SIMD Extensions
language support
loop unrolling
pragma
pragma simd
SIMD
user-mandated
vector copy
Programming Guidelines for Vectorization
non-vectorizable copy
programming guidelines
vectorizing
Loop Constructs
loops
visibility declaration attribute
Visual Studio*
Selecting a Configuration
Build a Project
Specifying Directory Paths
Selecting the Compiler Version
Options: Compilers dialog box
Options: Intel Libraries for oneAPI dialog box
Changing the Selected Intel Libraries for oneAPI
Including MPI Support
build configuration
build options
building with Intel® C++
changing directory paths
compiler selection
debug configuration
dialog boxes
Options: Compilers dialog box
Options: Intel Libraries for oneAPI dialog box
Compilers
Intel® Performance Libraries
Intel® Performance Libraries
MPI support
release configuration
warnings
GCC-Compatible Warning Options
Werror-all
Werror, WX
gcc-compatible
option changing to errors
Werror-all
Werror, WX
warnings and errors
whole program analysis
Windows* compiler options
Specifying Object Files
Specifying Include Files
Fo
I
X
Windows* compiler options
Specifying Assembly Files
Fa
worksharing
xiar
IPO-Related Performance Issues
Creating a Library from IPO Objects
xild
IPO-Related Performance Issues
Creating a Library from IPO Objects
Interprocedural Optimization (IPO)
xilib
xilibtool
xilink
IPO-Related Performance Issues
Creating a Library from IPO Objects
Interprocedural Optimization (IPO)