Verbose Mode Supported in Intel® MKL

Introduction:

Verbose mode has been available for the BLAS and LAPACK domains since Intel® Math Kernel Library (Intel® MKL) 11.2.

The Intel MKL 2018 release adds verbose mode support for the Fourier transform (FFT) domain.

This feature helps developers understand how their programs use Intel MKL functions at run time. Verbose mode reports the version of Intel MKL in use, the instruction set supported by the run-time processor, the Intel MKL functions called, the parameters passed to them, and the time spent in each function call.

Using Intel® MKL Verbose Mode

To enable the Intel MKL Verbose mode for an application, do one of the following:

•  Set the environment variable MKL_VERBOSE to 1

•  Call the support function mkl_verbose(1)

Verbose mode is disabled by default. When it is enabled, every call to a verbose-enabled function ends by printing a log line that includes the function name, the values of its arguments, and the time taken by the call. Version information is printed as well.
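The environment-variable route can also be scripted. The following is a minimal Python sketch, not part of Intel MKL: the helper name run_with_mkl_verbose is hypothetical, and it assumes the verbose log reaches standard output or standard error with the MKL_VERBOSE prefix seen in the examples in this article. The child process used here is a stand-in that only echoes the variable; it does not call Intel MKL.

```python
import os
import subprocess
import sys


def run_with_mkl_verbose(command):
    """Run `command` with MKL_VERBOSE=1 and return the lines that look like verbose log lines."""
    env = dict(os.environ)      # copy, so the parent environment stays untouched
    env["MKL_VERBOSE"] = "1"    # same effect as setting MKL_VERBOSE=1 in the shell
    result = subprocess.run(
        command,
        env=env,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge both streams (assumption: the log is on one of them)
        text=True,
    )
    # Keep only lines carrying the MKL_VERBOSE prefix.
    return [line for line in result.stdout.splitlines()
            if line.startswith("MKL_VERBOSE")]


# Stand-in child process: prints the variable it received, nothing more.
demo_command = [sys.executable, "-c",
                "import os; print('MKL_VERBOSE', os.environ['MKL_VERBOSE'])"]
print(run_with_mkl_verbose(demo_command))
```

In practice, demo_command would be replaced by the path to a binary linked against Intel MKL.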

Example 1: Using Verbose Mode for DGEMM (Double-Precision Matrix-Matrix Multiplication)

In this example, a program calls the matrix-matrix multiplication function dgemm() with MKL_VERBOSE switched on, which prints the following run-time information about the call.

The version information line:

MKL_VERBOSE Intel(R) MKL 11.2 build 20140312 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled processors, Lnx 2.70GHz lp64 intel_thread NMICDev:0

This line indicates that the Intel MKL version is 11.2, the processor supports Intel(R) AVX, the operating system is Linux, and the CPU frequency is 2.70GHz. The program uses the lp64 interface and the threaded Intel MKL library (intel_thread), and no coprocessor is in use (NMICDev:0).
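The fields of the version line can be pulled apart programmatically. The sketch below is illustrative only: the function name parse_version_line and the field names are hypothetical, the layout is inferred from the sample lines in this article rather than from a documented format, and the ilp64 token is an assumption.

```python
import re

VERSION_LINE = (
    "MKL_VERBOSE Intel(R) MKL 11.2 build 20140312 for Intel(R) 64 architecture "
    "Intel(R) Advanced Vector Extensions (Intel(R) AVX) enabled processors, "
    "Lnx 2.70GHz lp64 intel_thread NMICDev:0"
)


def parse_version_line(line):
    """Split the version information line into named fields (layout assumed from the samples)."""
    # Everything after the last ", " holds the short OS/frequency/interface tokens.
    head, _, tail = line.rpartition(", ")
    info = {"version": None, "os": None, "cpu_ghz": None,
            "interface": None, "threading": None, "coprocessors": None}
    match = re.search(r"\bMKL (\S+)", head)   # "MKL 11.2", not "MKL_VERBOSE"
    if match:
        info["version"] = match.group(1)
    for token in tail.split():
        if token.endswith("GHz"):
            info["cpu_ghz"] = float(token[:-3])
        elif token in ("lp64", "ilp64"):      # ilp64 is an assumption
            info["interface"] = token
        elif token.endswith("_thread"):
            info["threading"] = token
        elif token.startswith("NMICDev:"):
            info["coprocessors"] = int(token.split(":", 1)[1])
        else:
            info["os"] = token
    return info


print(parse_version_line(VERSION_LINE))
```

Tokens that are absent from a line, such as the interface in some logs, are simply left as None.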

The call description line:

MKL_VERBOSE DGEMM(N,N,1000,1000,1000,0x7fff10ff6560,0x7f9d09f20010,1000,0x7f9d0a6c2010,1000,0x7fff10ff6568,0x7f9d0977e010,1000) 15.79ms CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:16 WDiv:HOST:+0.000

This line shows that the program called DGEMM with the arguments N,N,1000,1000,1000,0x7fff10ff6560,0x7f9d09f20010,1000,0x7f9d0a6c2010,1000,0x7fff10ff6568,0x7f9d0977e010,1000. The call took 15.79ms. Conditional Numerical Reproducibility, which is controlled by the MKL_CBWR environment variable, is off (CNR:OFF), while MKL_DYNAMIC (Dyn:1) and the Fast Memory Manager (FastMM:1) are on. The line was printed by the thread with ID 0 (TID:0), and 16 threads were used in total (NThr:16). Ignore WDiv:HOST:+0.000, which applies only to coprocessors.
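A call description line has a regular shape, so it can be parsed with a small script, for example to collect timings from a log. The sketch below is a hedged illustration: parse_call_line and its field names are hypothetical, the pattern is inferred from the sample line above, and the set of time units (ns, us, ms, s) is an assumption.

```python
import re

CALL_LINE = (
    "MKL_VERBOSE DGEMM(N,N,1000,1000,1000,0x7fff10ff6560,0x7f9d09f20010,1000,"
    "0x7f9d0a6c2010,1000,0x7fff10ff6568,0x7f9d0977e010,1000) 15.79ms "
    "CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:16 WDiv:HOST:+0.000"
)

# Function name, parenthesised arguments, time with unit, then KEY:VALUE settings.
CALL_PATTERN = re.compile(
    r"^MKL_VERBOSE (?P<name>\w+)\((?P<args>.*)\) "
    r"(?P<time>[\d.]+)(?P<unit>ns|us|ms|s) (?P<settings>.*)$"
)

TO_SECONDS = {"ns": 1e-9, "us": 1e-6, "ms": 1e-3, "s": 1.0}  # assumed unit set


def parse_call_line(line):
    """Return the parts of a call description line, or None if the line does not match."""
    match = CALL_PATTERN.match(line)
    if match is None:
        return None
    # Split each setting on the first ":" only, so WDiv keeps "HOST:+0.000".
    settings = dict(item.split(":", 1) for item in match.group("settings").split())
    return {
        "name": match.group("name"),
        "args": match.group("args").split(","),
        "seconds": float(match.group("time")) * TO_SECONDS[match.group("unit")],
        "settings": settings,
    }


call = parse_call_line(CALL_LINE)
print(call["name"], len(call["args"]), call["settings"]["NThr"])
```

Summing the seconds field per function name over a whole log gives a rough profile of where time goes.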

Example 2: Using Verbose Mode for 2D Real FFT

The following example calls FFT functions. Build the FFT program to produce the binary, and set MKL_VERBOSE=1 before running it. The program then prints the following verbose information:

MKL_VERBOSE Intel(R) MKL 2018.0 Update 1 Product build 20170712 for Intel(R) 64 architecture Intel(R) Advanced Vector Extensions 2 (Intel(R) AVX2) enabled processors, Lnx 2.30GHz gnu_thread NMICDev:0
MKL_VERBOSE FFT: MAIN_DESC | sro4:5:3x5:1:1 | THR_LIMIT = 1 | 0.00s CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:36 WDiv:HOST:+0.000

The fields sro4:5:3x5:1:1 | THR_LIMIT = 1 are read as follows:
s/d: single or double precision;
c/r: complex or real domain;
i/o: in-place or out-of-place placement of the output (so sro in the example means single precision, real domain, out-of-place);
4:5:3x5:1:1: the dimensions, each given as length:input stride:output stride, listed from the highest-numbered dimension to the lowest-numbered. In the example provided,
4:5:3 describes the 2nd dimension: 4 is its size, 5 is its input stride, and 3 is its output stride;
"x" is a delimiter between dimensions;
5:1:1 describes the 1st dimension: 5 is its size, 1 is its input stride, and the other 1 is its output stride
(additional dimensions follow the same pattern);
THR_LIMIT is the maximum number of threads that will be used for the computation.
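The descriptor string can be decoded mechanically by following these rules. The sketch below is illustrative only: decode_fft_descriptor and the dictionary keys are hypothetical names, and the code assumes exactly the layout described here (three flag letters followed by "x"-separated length:input stride:output stride triples).

```python
PRECISION = {"s": "single", "d": "double"}
DOMAIN = {"c": "complex", "r": "real"}
PLACEMENT = {"i": "in-place", "o": "out-of-place"}


def decode_fft_descriptor(text):
    """Decode a descriptor such as 'sro4:5:3x5:1:1' into a dictionary."""
    dimensions = []
    # The first three characters are flags; the rest lists dimensions,
    # highest-numbered first, separated by "x".
    for part in text[3:].split("x"):
        length, input_stride, output_stride = (int(value) for value in part.split(":"))
        dimensions.append({
            "length": length,
            "input_stride": input_stride,
            "output_stride": output_stride,
        })
    return {
        "precision": PRECISION[text[0]],
        "domain": DOMAIN[text[1]],
        "placement": PLACEMENT[text[2]],
        "dimensions": dimensions,
    }


print(decode_fft_descriptor("sro4:5:3x5:1:1"))
```

For the example descriptor, this yields a single-precision, real, out-of-place transform with two dimensions of lengths 4 and 5.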

F/B SCALE may appear. It gives the forward/backward scaling factors. Default values of 1 are not printed. (For example, F/B SCALE = 1/0.5 means that the backward scale equals 0.5.)
PACK may appear. It gives the packed format for the real domain. The most common format, CCE, is not printed. (For example, PACK = CCS means that the CCS packed format is used.)
Other parameters that can be set through the Intel(R) MKL FFT API are printed only if non-default values were used.

The remaining values on the line have the same meaning as for BLAS or LAPACK, except that the time, which is always 0, should be ignored.
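An FFT verbose line as a whole can be split on its "|" separators. The sketch below is a minimal illustration based on the sample line in this example: split_fft_line and its keys are hypothetical names, and it assumes that optional parameters such as F/B SCALE or PACK would appear as additional "KEY = VALUE" segments between the descriptor and the final segment, which the sample line does not confirm.

```python
FFT_LINE = (
    "MKL_VERBOSE FFT: MAIN_DESC | sro4:5:3x5:1:1 | THR_LIMIT = 1 | "
    "0.00s CNR:OFF Dyn:1 FastMM:1 TID:0  NThr:36 WDiv:HOST:+0.000"
)


def split_fft_line(line):
    """Split an FFT verbose line into header, descriptor, options, and common settings."""
    segments = [segment.strip() for segment in line.split("|")]
    header, descriptor, *middle, tail = segments
    options = {}
    for segment in middle:
        # Assumed form of each optional segment: "KEY = VALUE".
        key, _, value = segment.partition("=")
        options[key.strip()] = value.strip()
    # The last segment starts with the time (always 0 for FFT), then KEY:VALUE settings.
    time_text, *setting_items = tail.split()
    settings = dict(item.split(":", 1) for item in setting_items)
    return {
        "header": header,
        "descriptor": descriptor,
        "options": options,
        "time": time_text,
        "settings": settings,
    }


parsed = split_fft_line(FFT_LINE)
print(parsed["descriptor"], parsed["options"], parsed["settings"]["NThr"])
```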

Limitations

Because every call to a verbose-enabled function requires an output operation, the performance of the application may degrade with the verbose mode enabled.

In addition, Intel MKL verbose mode has the following limitations:

• Input values of parameters passed by reference are not printed if the values were changed by the function.

For example, if a LAPACK function is called with a workspace query, that is, the value of the lwork parameter equals -1 on input, the call description line prints the result of the query and not -1.

• Return values of functions are not printed.

For example, the value returned by the function ilaenv is not printed.

• Floating-point scalars passed by reference are not printed.

For more details about verbose mode, see the Intel MKL User Guide.

For more complete information about compiler optimizations, see our Optimization Notice.