OpenMP* Threaded Functions and Problems
 Direct sparse solver.
 LAPACK. For a list of threaded routines, see LAPACK Routines.
 Level1 and Level2 BLAS. For a list of threaded routines, see Threaded BLAS Level1 and Level2 Routines.
 All Level 3 BLAS and all Sparse BLAS routines except Level 2 Sparse Triangular solvers.
 All Vector Mathematics functions (except service functions).
 FFT. For a list of FFT transforms that can be threaded, see Threaded FFT Problems.

LAPACK Routines
In the routine names that follow, ? stands for the precision prefix of each flavor of the routine (s, d, c, or z).
 Linear equations, computational routines:
   Factorization: ?getrf, ?getrfnpi, ?gbtrf, ?potrf, ?pptrf, ?sytrf, ?hetrf, ?sptrf, ?hptrf
   Solving: ?dttrsb, ?gbtrs, ?gttrs, ?pptrs, ?pbtrs, ?pttrs, ?sytrs, ?sptrs, ?hptrs, ?tptrs, ?tbtrs
 Orthogonal factorization, computational routines: ?geqrf, ?ormqr, ?unmqr, ?ormlq, ?unmlq, ?ormql, ?unmql, ?ormrq, ?unmrq
 Singular Value Decomposition, computational routines: ?gebrd, ?bdsqr
 Symmetric Eigenvalue Problems, computational routines: ?sytrd, ?hetrd, ?sptrd, ?hptrd, ?steqr, ?stedc
 Generalized Nonsymmetric Eigenvalue Problems, computational routines: chgeqz/zhgeqz
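As a small illustration of the naming convention used in these lists, the sketch below expands a ? pattern into concrete routine names. The helper expand_flavors is hypothetical and is not part of any library. Which prefixes are valid differs per routine (for example, Hermitian routines such as ?hetrf exist only in complex flavors), so the prefix set is passed in explicitly and should be checked against the LAPACK reference.

```python
def expand_flavors(pattern, prefixes="sdcz"):
    """Expand a leading '?' into one routine name per precision prefix.

    Hypothetical helper for illustration only. Names without a leading '?'
    (for example 'chgeqz') are returned unchanged.
    """
    if not pattern.startswith("?"):
        return [pattern]
    return [prefix + pattern[1:] for prefix in prefixes]


# ?getrf is available in all four precisions.
print(expand_flavors("?getrf"))        # ['sgetrf', 'dgetrf', 'cgetrf', 'zgetrf']

# ?hetrf applies to complex Hermitian matrices, so only c and z are passed.
print(expand_flavors("?hetrf", "cz"))  # ['chetrf', 'zhetrf']
```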
Threaded BLAS Level1 and Level2 Routines
 Level1 BLAS: ?axpy, ?copy, ?swap, ddot/sdot, cdotc, drot/srot
 Level2 BLAS: ?gemv, ?trsv, ?trmv, dsyr/ssyr, dsyr2/ssyr2, dsymv/ssymv
Threaded FFT Problems
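Whether an FFT computation can be threaded depends on the following characteristics of the problem:
 rank
 domain
 size/length
 precision (single or double)
 placement (in-place or out-of-place)
 strides
 number of transforms
 layout (for example, interleaved or split layout of complex data)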
Architecture   Conditions
Intel® 64      N is a power of 2, log_2(N) > 9, the transform is double-precision out-of-place, and input/output strides equal 1.
IA-32          N is a power of 2, log_2(N) > 13, and the transform is single-precision.
IA-32          N is a power of 2, log_2(N) > 14, and the transform is double-precision.
Any            N is composite, log_2(N) > 16, and input/output strides equal 1.
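
The conditions in the table above can be written as a small predicate. This is only a sketch that restates the table; it does not query the library and is not an MKL API. The function name matches_table and the architecture labels "intel64" and "ia32" are hypothetical, and two readings are assumed: a transform qualifies if any one row matches, and "composite" is taken in the usual number-theoretic sense (N has a divisor other than 1 and itself).

```python
import math


def is_power_of_two(n):
    """True if n is a positive integer power of 2 (including 1)."""
    return n > 0 and (n & (n - 1)) == 0


def is_composite(n):
    """True if n has a divisor other than 1 and itself (assumed reading)."""
    if n < 4:
        return False
    if n % 2 == 0:
        return True
    for d in range(3, math.isqrt(n) + 1, 2):
        if n % d == 0:
            return True
    return False


def matches_table(arch, n, precision, out_of_place, unit_strides):
    """Return True if a size-n transform satisfies at least one table row.

    arch: "intel64" or "ia32" (hypothetical labels for this sketch)
    precision: "single" or "double"
    """
    if n < 1:
        return False
    log2_n = math.log2(n)
    pow2 = is_power_of_two(n)
    rows = [
        # Intel 64 row
        arch == "intel64" and pow2 and log2_n > 9
        and precision == "double" and out_of_place and unit_strides,
        # IA-32 row, single precision
        arch == "ia32" and pow2 and log2_n > 13 and precision == "single",
        # IA-32 row, double precision
        arch == "ia32" and pow2 and log2_n > 14 and precision == "double",
        # "Any" row, independent of architecture and precision
        is_composite(n) and log2_n > 16 and unit_strides,
    ]
    return any(rows)


print(matches_table("intel64", 1024, "double", True, True))  # True: log_2(N) = 10 > 9
print(matches_table("intel64", 512, "double", True, True))   # False: log_2(N) = 9
print(matches_table("ia32", 2**14, "single", False, False))  # True: log_2(N) = 14 > 13
```

The thresholds are strict inequalities, so sizes exactly at the boundary (for example N = 512 on Intel® 64) do not satisfy the corresponding row.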
