Developer Guide

MKL_NUM_STRIPES

The MKL_NUM_STRIPES environment variable controls the Intel® MKL threading algorithm for ?gemm functions. When MKL_NUM_STRIPES is set to a positive integer value nstripes, Intel® MKL tries to use a number of partitions equal to nstripes along the leading dimension of the output matrix.
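
An application that wants to log or check the requested value can read the variable itself. The following is a minimal sketch of that idea using only the C standard library; the helper names parse_stripes and stripes_from_env are hypothetical, and the parsing rules shown here are an assumption of this sketch, not a description of how Intel® MKL interprets the variable.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to an int.
   Returns 1 and stores the value in *out when text is a complete
   base-10 integer that fits in an int; returns 0 otherwise. */
int parse_stripes(const char *text, int *out)
{
    char *end = NULL;
    long value;

    if (text == NULL || *text == '\0') {
        return 0;                       /* unset or empty */
    }
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || *end != '\0' || value < INT_MIN || value > INT_MAX) {
        return 0;                       /* trailing junk or out of range */
    }
    *out = (int)value;
    return 1;
}

/* Hypothetical helper: reads MKL_NUM_STRIPES from the process environment.
   This only inspects the variable; it does not change MKL behavior. */
int stripes_from_env(int *out)
{
    return parse_stripes(getenv("MKL_NUM_STRIPES"), out);
}
```

A caller would typically invoke stripes_from_env once at startup and report the result next to the thread count, so that benchmark logs record which partitioning was requested.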
The following table explains how the value nstripes of MKL_NUM_STRIPES defines the partitioning algorithm that Intel® MKL uses for the ?gemm output matrix; max_threads_for_mkl denotes the maximum number of OpenMP threads for Intel® MKL:
Value of MKL_NUM_STRIPES | Partitioning Algorithm
------------------------ | ----------------------
1 < nstripes < (max_threads_for_mkl/2) | 2D partitioning with the number of partitions equal to nstripes: horizontal for column-major ordering, vertical for row-major ordering.
nstripes = 1 | 1D partitioning algorithm along the opposite direction of the leading dimension.
nstripes ≥ (max_threads_for_mkl/2) | 1D partitioning algorithm along the leading dimension.
nstripes < 0 | The default Intel® MKL threading algorithm.
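
The rules in the table can be restated as a small decision function. This is only a sketch for reasoning about which row applies: the enum and the function name classify_stripes are hypothetical and are not MKL identifiers. It makes three assumptions that the table does not settle: the nstripes = 1 row takes precedence when it overlaps with the ≥ row, max_threads_for_mkl/2 is treated as an exact (non-truncating) division, and nstripes = 0 is reported as unspecified because the table does not list it.

```c
/* Illustrative labels for the rows of the table; not MKL identifiers. */
typedef enum {
    PARTITION_DEFAULT,        /* nstripes < 0 */
    PARTITION_1D_OPPOSITE,    /* nstripes = 1 */
    PARTITION_2D,             /* 1 < nstripes < max_threads_for_mkl/2 */
    PARTITION_1D_LEADING,     /* nstripes >= max_threads_for_mkl/2 */
    PARTITION_UNSPECIFIED     /* nstripes = 0: not covered by the table */
} partition_kind;

/* Hypothetical helper: maps (nstripes, max_threads_for_mkl) to a table row. */
partition_kind classify_stripes(int nstripes, int max_threads_for_mkl)
{
    if (nstripes < 0) {
        return PARTITION_DEFAULT;
    }
    if (nstripes == 0) {
        return PARTITION_UNSPECIFIED;
    }
    if (nstripes == 1) {
        return PARTITION_1D_OPPOSITE;   /* assumed to win over the >= row */
    }
    /* Compare in long long so 2 * nstripes cannot overflow, and so the
       threshold max_threads_for_mkl/2 is not rounded down. */
    if (2LL * nstripes < (long long)max_threads_for_mkl) {
        return PARTITION_2D;
    }
    return PARTITION_1D_LEADING;
}
```

For example, with max_threads_for_mkl = 8 the call classify_stripes(2, 8) falls in the 2D row, while classify_stripes(4, 8) falls in the 1D row along the leading dimension because 4 is not less than 8/2.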
The following figure shows the partitioning of an output matrix for nstripes = 4 and a total number of 8 OpenMP threads for column-major and row-major orderings:
You can use the support functions mkl_set_num_stripes and mkl_get_num_stripes to set and query the number of stripes, respectively.
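
As an illustration of the two support functions, the sketch below changes the number of stripes around one piece of work and then restores the earlier value. The helper names run_with_stripes and record_current_stripes, and the build flag USE_MKL, are hypothetical. When USE_MKL is not defined, the sketch falls back to local stand-ins that merely remember the last value set, so the control flow can be tried without the library; their initial value of -1 is an arbitrary placeholder, not a documented default.

```c
#ifdef USE_MKL   /* hypothetical flag: define it when building against Intel MKL */
#include <mkl.h> /* declares mkl_set_num_stripes and mkl_get_num_stripes */
#else
/* Stand-ins (assumption of this sketch): same signatures, no real effect. */
static int stand_in_stripes = -1;
static void mkl_set_num_stripes(int nstripes) { stand_in_stripes = nstripes; }
static int mkl_get_num_stripes(void) { return stand_in_stripes; }
#endif

/* Callback type for the work to run under a temporary stripe setting. */
typedef void (*stripe_work_fn)(void *context);

/* Hypothetical helper: sets the number of stripes, runs work(context),
   then restores the value that was in effect before. Returns that
   earlier value. */
int run_with_stripes(int nstripes, stripe_work_fn work, void *context)
{
    int previous = mkl_get_num_stripes();

    mkl_set_num_stripes(nstripes);
    work(context);                     /* for example, a ?gemm call */
    mkl_set_num_stripes(previous);
    return previous;
}

/* Example callback: stores the value seen while the work is running. */
void record_current_stripes(void *context)
{
    *(int *)context = mkl_get_num_stripes();
}
```

In a real program the callback would wrap the ?gemm call whose partitioning is being tuned; record_current_stripes is only there to make the effect of the temporary setting observable.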
Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
This notice covers the following instruction sets: SSE2, SSE3, and SSSE3.
