Tutorial

  • 327357-009
  • 04/15/2019

Measuring Effect of Threading on dgemm

By default, Intel MKL uses n threads, where n is the number of physical cores on the system. By restricting the number of threads and measuring the change in performance of dgemm, this exercise shows how threading impacts performance.

Limit the Number of Cores Used for dgemm

This exercise uses the mkl_set_num_threads routine to override the default number of threads, and mkl_get_max_threads to determine the maximum number of threads.
* Fortran source code is found in dgemm_threading_effect_example.f

      PRINT *, "Finding max number of threads Intel(R) MKL can use for"
      PRINT *, "parallel runs"
      PRINT *, ""

      MAX_THREADS = MKL_GET_MAX_THREADS()

      PRINT 20," Running Intel(R) MKL from 1 to ",MAX_THREADS," threads"
 20   FORMAT(A,I2,A)
      PRINT *, ""

      DO L = 1, MAX_THREADS

        DO I = 1, M
          DO J = 1, N
            C(I,J) = 0.0
          ENDDO
        ENDDO

        PRINT 30, " Requesting Intel(R) MKL to use ",L," thread(s)"
 30     FORMAT(A,I2,A)
        CALL MKL_SET_NUM_THREADS(L)

        PRINT *, "Making the first run of matrix product using "
        PRINT *, "Intel(R) MKL DGEMM subroutine to get stable "
        PRINT *, "run time measurements"
        PRINT *, ""
        CALL DGEMM('N','N',M,N,K,ALPHA,A,M,B,K,BETA,C,M)

        PRINT *, "Measuring performance of matrix product using "
        PRINT 40, " Intel(R) MKL DGEMM subroutine on ",L," thread(s)"
 40     FORMAT(A,I2,A)
        PRINT *, ""
        S_INITIAL = DSECND()
        DO R = 1, LOOP_COUNT
          CALL DGEMM('N','N',M,N,K,ALPHA,A,M,B,K,BETA,C,M)
        END DO
        S_ELAPSED = (DSECND() - S_INITIAL) / LOOP_COUNT

        PRINT *, "== Matrix multiplication using Intel(R) MKL DGEMM =="
        PRINT 50, " == completed at ",S_ELAPSED*1000," milliseconds =="
        PRINT 60, " == using ",L," thread(s) =="
 50     FORMAT(A,F12.5,A)
 60     FORMAT(A,I2,A)
        PRINT *, ""

      END DO

Examine the results and notice that the time to multiply the matrices decreases as the number of threads increases. If you run this exercise with more threads than the number returned by mkl_get_max_threads, performance might degrade, because you would then be using more threads than there are physical cores.
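One way to quantify the trend in these results is to convert the per-thread timings into speedup (single-thread time divided by the time on L threads) and parallel efficiency (speedup divided by L). The following is a minimal sketch in free-form Fortran, not part of the tutorial's source file. The names scaling_metrics, speedup, and efficiency are hypothetical helpers rather than Intel MKL routines, and the timing values are placeholders, not measured results; substitute the S_ELAPSED values printed by your own run.

```fortran
! Sketch only: module/function names are hypothetical, not Intel MKL APIs.
module scaling_metrics
  implicit none
contains
  ! Speedup of a run relative to the single-thread run.
  pure function speedup(t_one, t_l) result(s)
    double precision, intent(in) :: t_one, t_l
    double precision :: s
    s = t_one / t_l
  end function speedup

  ! Parallel efficiency: speedup divided by thread count (1.0 = ideal).
  pure function efficiency(t_one, t_l, l) result(e)
    double precision, intent(in) :: t_one, t_l
    integer, intent(in) :: l
    double precision :: e
    e = speedup(t_one, t_l) / dble(l)
  end function efficiency
end module scaling_metrics

program threading_speedup
  use scaling_metrics
  implicit none
  integer, parameter :: max_threads = 4   ! assumed thread count
  double precision :: t_ms(max_threads)   ! average DGEMM time per call, ms
  integer :: l

  ! PLACEHOLDER timings for 1..max_threads threads (illustrative only).
  t_ms = (/ 100.0d0, 50.0d0, 40.0d0, 25.0d0 /)

  print '(A)', ' threads   time(ms)   speedup   efficiency'
  do l = 1, max_threads
    print '(I8,F11.3,F10.2,F13.2)', l, t_ms(l), &
          speedup(t_ms(1), t_ms(l)), efficiency(t_ms(1), t_ms(l), l)
  end do
end program threading_speedup
```

An efficiency that drops well below 1.0 as L grows suggests that additional threads are yielding diminishing returns for the chosen matrix sizes.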
You can see specific performance results for dgemm at the Details tab at http://software.intel.com/en-us/articles/intel-mkl.

Product and Performance Information

Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.

Notice revision #20110804