Intel® MKL 10.3 Release Notes

Please see the following links available online for the latest information regarding the Intel MKL library:

Links to documentation, help, and code samples can be found on the main Intel MKL product page. For technical support visit the Intel MKL technical support forum and review the articles in the Intel MKL knowledgebase. When using Intel MKL for the first time the Link Line Advisor can be particularly helpful for finding the right libraries to link.

Please register your product using your preferred email address. This helps Intel recognize you as a valued customer in the support forum and ensures that you will be notified of product updates. You can read Intel's Online Privacy Notice Summary if you have any questions regarding the use of your email address for software product registration.

What's New in Intel® MKL 10.3 update 12

  • FFT: Improved dynamic adjustment of the number of threads for single precision 2D FFT

What's New in Intel® MKL 10.3 update 11

  • BLAS: Improved Level 1 BLAS on 32-bit Microsoft Windows XP* for Intel® Xeon® processor 5400 series (Harpertown)
  • LAPACK: Introduced support for LAPACK version 3.4.1
  • FFT: Added AVX/AVX2 optimizations that significantly improve the performance of complex-to-complex FFTs of sizes 8, 12, 14, 21, and 96
  • VSL: Improved performance of viRngGeometric on Intel® Advanced Vector Extensions (AVX)
  • The mklvars.* scripts no longer set $FPATH in the environment, and the internal variable MKL_TARGET_ARCH is no longer exported. This change will not impact users, as the Intel compiler no longer requires the $FPATH variable
  • Bug fixes

What's New in Intel® MKL 10.3 update 10

  • BLAS: Improved dznrm2 and dnrm2 performance for 32-bit programs supporting Intel® Advanced Vector Extensions (Intel® AVX)
  • LAPACK: Introduced support for LAPACK version 3.4.0
  • FFT: Added Intel AVX optimizations for 1D/2D/3D transforms on Mac OS* X
  • Data Fitting: Improved performance of SearchCells1D() function on Intel® Xeon® E7-4870 and E5-2690 processors for:
    • Arbitrary non-uniform and quasi-uniform partitions where the number of interpolation sites is greater than 32
    • All types of partitions where the number of interpolation sites is less than 32
  • Bug fixes

What's New in Intel® MKL 10.3 update 9

  • LAPACK: Improved [C/Z]GEEV performance for very small sizes (~10x10)
  • FFTs: Threaded the real in-place 1D FFTs for a significant increase in performance
  • FFTs: Introduced new algorithms for improved scalability of power-of-2 double-precision complex 1D FFTs on Intel® Xeon® Processor E5 series systems running 32-bit operating systems
  • Random number Generators: added support for a non-deterministic random number generator based on the RdRand instruction and supporting hardware available in processors based on the Intel codename "Ivy Bridge" microarchitecture
  • Vector Math Functions: improved performance of the Erf() and Pow3o2() functions on Intel® Core™ processors
  • Data Fitting: Improved performance of routines for spline-based evaluation, differentiation, and integration on Intel Xeon 5600 and 7500 series and Intel Core i7-2600 series processors
  • Introduced support for version 11.10 of the PGI* C and Fortran compilers
  • Bug fixes

What's New in Intel® MKL 10.3 update 8

  • Data Fitting component: Added a set of new data fitting functions covering one-dimensional algorithms for vector spline construction, cell or bin search, and evaluation, differentiation, and integration of the spline interpolants. Includes support for:
    • Linear, quadratic, cubic, step-wise const, and user-defined splines
    • Cell search with configuration parameters for optimal performance
    • User-defined interpolation and extrapolation
    • Vector-valued functions
    • Column- and row-major storage formats
  • Sparse BLAS: Improved compressed sparse row matrix-vector multiply (?CSRMV) performance for very sparse matrices on high core counts supporting Intel Advanced Vector Extensions (AVX)
  • FFTs: Improved the performance of the 1D double precision FFTs on systems supporting Intel AVX
  • Statistics functions: Improved the performance and scalability for computing the Variance-Covariance and Correlation matrices (FAST method) on Intel® Core processors
  • Added a Microsoft* Visual Studio* project tool for building custom DLLs from static library files
  • Bug fixes
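
To make the cell search and spline evaluation steps named in the Data Fitting item above more concrete, here is a minimal pure-Python sketch of what they compute for a linear spline. It is not the Intel MKL Data Fitting API: the function names `search_cell` and `eval_linear_spline` are hypothetical, and the cell-numbering and clamping conventions are assumptions made for illustration only.

```python
from bisect import bisect_right

def search_cell(breakpoints, site):
    """Return i such that breakpoints[i] <= site < breakpoints[i + 1].

    Assumed convention: sites outside the partition are clamped to the
    first or last cell.
    """
    i = bisect_right(breakpoints, site) - 1
    return min(max(i, 0), len(breakpoints) - 2)

def eval_linear_spline(breakpoints, values, site):
    """Evaluate the piecewise-linear interpolant through (breakpoints, values)."""
    i = search_cell(breakpoints, site)
    x0, x1 = breakpoints[i], breakpoints[i + 1]
    t = (site - x0) / (x1 - x0)  # relative position inside cell i
    return values[i] + t * (values[i + 1] - values[i])

# A non-uniform partition with four breakpoints (three cells).
x = [0.0, 1.0, 2.0, 4.0]
y = [0.0, 10.0, 20.0, 0.0]
cells = [search_cell(x, s) for s in (0.5, 1.0, 3.0)]
```

Binary search keeps the lookup cost logarithmic in the number of breakpoints, which is one reason the search step is treated as a separately tunable operation.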

What's New in Intel® MKL 10.3 update 7

  • BLAS: Improved DSYRK/SSYRK threaded performance for small output matrices and large outer products (i.e., rectangular input matrices), on all recent Intel® Xeon® processors
  • BLAS: Improved ?GEMM performance for small problems (<10) where beta = 1 on all recent Intel Xeon processors
  • BLAS: Improved DSCAL performance for small problems and for cases where INCX=1 on 32-bit programs running on Intel Xeon processors 5500, 5600, and 7500 series
  • BLAS-like extensions: Improved threading and cache utilization of in-place transposition of square matrices
  • PARDISO: Introduced an independent threading control for PARDISO; use MKL_DOMAIN_PARDISO with the mkl_domain_set_num_threads() function
  • Poisson Library: Added support for 2D and 3D periodic boundary conditions
  • Included the Link Line Advisor in the documentation directory
  • Added a command line link tool for use with scripting tools such as libtool
  • Added C header files with stdcall prototypes for functions in the following components: BLAS, Sparse BLAS, LAPACK, PARDISO/DSS, RCI Iterative Solvers, Vector Mathematical Functions, Vector Statistical Functions, and the support functions
  • Changed the names of constants used to specify the domain in the mkl_domain_set_num_threads() function (e.g., MKL_BLAS has become MKL_DOMAIN_BLAS); the old names still exist with the exception of MKL_PARDISO
  • Bug fixes

What's New in Intel® MKL 10.3 update 6

  • Sparse BLAS: Added a new option to the mkl_?csrbsr converter function allowing detection and removal of zero elements when converting from the BSR format to the CSR format
  • Changed DLL loading behavior on Windows*: Intel MKL DLLs can no longer be spread across separate directories on the PATH; they must all be in the same directory, either with the executable or in a single directory specified in the PATH environment variable
  • Bug fixes
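
As an illustration of the BSR-to-CSR conversion with zero removal mentioned above, the following pure-Python sketch expands block storage into scalar rows and optionally skips explicit zeros. It does not call mkl_?csrbsr; the function name `bsr_to_csr`, the `drop_zeros` flag, and the storage layout (zero-based indices, row-major dense blocks as nested lists) are assumptions for illustration.

```python
def bsr_to_csr(block_size, blocks, block_cols, block_row_ptr, drop_zeros=True):
    """Convert a BSR matrix to CSR arrays (values, cols, row_ptr).

    blocks[k] is a dense block_size x block_size block, block_cols[k] is its
    block-column index, and block_row_ptr[i]:block_row_ptr[i + 1] spans the
    blocks that belong to block row i.
    """
    values, cols, row_ptr = [], [], [0]
    for bi in range(len(block_row_ptr) - 1):
        for r in range(block_size):  # scalar row inside this block row
            for k in range(block_row_ptr[bi], block_row_ptr[bi + 1]):
                for c in range(block_size):
                    v = blocks[k][r][c]
                    if drop_zeros and v == 0:
                        continue  # padding zero stored only to fill the block
                    values.append(v)
                    cols.append(block_cols[k] * block_size + c)
            row_ptr.append(len(values))
    return values, cols, row_ptr

# 4x4 matrix stored as three 2x2 blocks:
#   1 0 | 0 3
#   0 2 | 0 0
#   ----+----
#   0 0 | 4 0
#   0 0 | 5 6
blocks = [[[1, 0], [0, 2]], [[0, 3], [0, 0]], [[4, 0], [5, 6]]]
block_cols = [0, 1, 1]
block_row_ptr = [0, 2, 3]

compact = bsr_to_csr(2, blocks, block_cols, block_row_ptr)
padded = bsr_to_csr(2, blocks, block_cols, block_row_ptr, drop_zeros=False)
```

In this example the block format stores 12 values, of which only 6 are nonzero; dropping the padding halves the CSR storage.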

What's New in Intel® MKL 10.3 update 5

  • BLAS: Improved performance: {S,C,Z}TRSM for processors with Intel® Advanced Vector Extensions (Intel® AVX); {S,D}GEM2VU for processors with Intel AVX as well as the Intel® Core™ i7 processor and the Intel® Xeon® processor 5500 series
  • BLAS: Improved scaling: ?TRMV for large matrices on all architectures; DGEMM for odd numbers of threads on Intel® Xeon® processor 5400 series
  • LAPACK: Included LAPACK 3.3.1 extensions and the respective LAPACKE interfaces
  • LAPACK: Improved the performance of ?SYGST and ?HEGST used in generalized eigenvalue problems
  • LAPACK: Improved the performance of the inverse of an LU factored matrix (?GETRI)
  • PARDISO: Added transpose and conjugate transpose solve capability (A^T x = b and A^H x = b); facilitates compressed sparse column (CSC) format support
  • PARDISO: Improved out-of-core PARDISO performance when the memory requirements slightly exceed available memory, using the MKL_PARDISO_OOC_MAX_SWAP_SIZE environment variable and in-core PARDISO
  • Optimization Solvers: Added Inf and NaN checks in the RCI Trust-Region solvers
  • FFTs: Improved the performance of 3D FFTs on small cubes from 2x2x2 to 10x10x10 for all supported precisions and types on all Intel® processors supporting Intel® SSE3 and later
  • FFT examples: Re-designed example programs to cover common use cases for Intel MKL DFTI and FFTW
  • VSL: Improved the performance of the single precision MT19937 and MT2203 basic random number generators on the Intel® Core™ i7-2600 processor on 64-bit operating systems
  • VSL: Improved the performance of the integer version of the SOBOL quasi-random number generator on the Intel® Core™ i7-2600 processor and Intel® Xeon® processor 5400 series
  • Bug fixes
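
The PARDISO item above notes that the transpose solve facilitates CSC support. A likely reason is the standard identity that the CSC arrays of a matrix A are exactly the CSR arrays of its transpose. The pure-Python sketch below checks that identity on a small matrix; it does not call PARDISO, and the helper names `csr_to_dense` and `transpose` are hypothetical.

```python
def csr_to_dense(n_rows, n_cols, values, indices, ptr):
    """Expand zero-based CSR arrays into a dense list of rows."""
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        for k in range(ptr[i], ptr[i + 1]):
            dense[i][indices[k]] = values[k]
    return dense

def transpose(m):
    return [list(col) for col in zip(*m)]

A = [[1.0, 0.0, 2.0],
     [0.0, 3.0, 0.0],
     [4.0, 0.0, 5.0]]

# The same matrix stored column by column (CSC, zero-based).
csc_values = [1.0, 4.0, 3.0, 2.0, 5.0]
csc_rows = [0, 2, 1, 0, 2]
csc_col_ptr = [0, 2, 3, 5]

# Reading the unchanged CSC arrays as if they were CSR yields A^T.
read_as_csr = csr_to_dense(3, 3, csc_values, csc_rows, csc_col_ptr)
```

Under that identity, a solver that takes CSR input and can solve the transposed system can be handed the CSC arrays of A unchanged to solve A x = b.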

What's New in Intel® MKL 10.3 update 4

  • BLAS: Improved DTRMM performance on Intel® Xeon® processors 5400 and later
  • BLAS: Improved DTRSM performance on all 64-bit enabled processors, especially processors with Intel® Advanced Vector Extensions (Intel® AVX)
  • LAPACK: Incorporated bug fixes from the LAPACK 3.3.1 release
  • OOC PARDISO: Improved the estimate of the amount of memory needed in out-of-core operation
  • FFT: Improved 1D real FFT scaling through improved threading
  • FFT: Updated C and Fortran FFT examples to use the new single dynamic library linking model
  • VML: Improved performance of the single precision Enhanced Performance version of the real Hypot and complex Abs functions and of the complex Arg, Div, Mul, MulByConj functions for all accuracy modes on Intel® Xeon® processors 5600 and 7500 series, and the Intel® Core™ i7-2600 processor
  • Service functions: Improvements and additions to the Intel MKL service functions
    • Improved mkl_mem_stat() to gather all memory statistics
    • Added new timing functions: mkl_get_max_cpu_frequency() and mkl_get_clocks_frequency()
    • Changed mkl_get_cpu_frequency() to return current CPU Frequency in GHz
  • Bug fixes

What's New in Intel® MKL 10.3 update 3

  • BLAS: Improved multi-threaded performance of DSYRK, DTRSM, and DGEMM on Intel® Xeon® processor 5400 series running 32-bit Windows*
  • LAPACK: Implemented LAPACK 3.3 from netlib including Cosine-Sine decomposition, improved linear equations solvers for symmetric and Hermitian matrices and auxiliary functions
  • PARDISO: 0-based permutation vectors are now allowed at input
  • PARDISO: Added documentation for the pardisoinit() routine
  • PARDISO: Improved performance of serial PARDISO with multiple right-hand sides (RHS)
  • PARDISO: Independent control for parallelism in the solve step for improved performance on small matrices; see the description of iparm(25)
  • PARDISO: Reduced backward substitution, which allows partial solution computation for a full RHS; see the description of iparm(31)
  • FFT: Implemented Real FFT transforms for 3 to 7 dimensions
  • FFT: Parallelized multi-dimensional complex transforms using split-complex data represented as two real arrays
  • Cluster FFTs: Extended FORTRAN 90 interface to real-to-complex transforms and included new examples
  • VML: Added new complex Pack/Unpack functions and real Gamma/LGamma functions
  • VML: Improved performance on Intel® Xeon® processor 5600 series and processors supporting Intel® Advanced Vector Extensions (Intel® AVX) for the following: all functions when operating on short vectors (<100), all functions when operating on unaligned input vectors, the sPow2o3 function, and the enhanced performance (EP) version of complex Add and Sub
  • VSL: Functions for saving/restoring random number generator (RNG) streams to/from memory
  • VSL: Added new UniformBits32 and UniformBits64 functions
  • VSL: Extended the number of unique streams supported by the MT2203 Basic RNG from 1024 to 6024
  • Bug fixes
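
For readers unfamiliar with the split-complex representation mentioned in the FFT item above, the sketch below shows complex data held as two real arrays and a naive DFT that works directly on that layout. It is a pure-Python illustration, not the Intel MKL DFTI interface; the function names are hypothetical and the O(n^2) transform is only there to show that the two layouts carry the same information.

```python
import cmath
import math

def to_split(interleaved):
    """Split a list of complex numbers into separate real and imaginary lists."""
    return [z.real for z in interleaved], [z.imag for z in interleaved]

def dft_split(re, im):
    """Naive forward DFT on split-complex input, returning split-complex output."""
    n = len(re)
    out_re, out_im = [0.0] * n, [0.0] * n
    for k in range(n):
        for j in range(n):
            angle = -2.0 * math.pi * j * k / n
            c, s = math.cos(angle), math.sin(angle)
            # (re + i*im) * (c + i*s)
            out_re[k] += re[j] * c - im[j] * s
            out_im[k] += re[j] * s + im[j] * c
    return out_re, out_im

def dft_complex(x):
    """Reference naive forward DFT on ordinary complex numbers."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * math.pi * j * k / n) for j in range(n))
            for k in range(n)]

data = [1 + 2j, 3 - 1j, 0.5j, -2 + 0j]
split_re, split_im = dft_split(*to_split(data))
reference = dft_complex(data)
max_error = max(abs(complex(a, b) - z)
                for a, b, z in zip(split_re, split_im, reference))
```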

What's New in Intel® MKL 10.3 update 2

  • BLAS: Improved performance of transposition functions on the Intel® Xeon® processor 5600 series
  • BLAS: Added examples for transposition routines
  • FFT: Added Fortran examples showing how to reduce application footprint by linking only functions with the desired precision
  • FFT: Added check for stride consistency on in-place real transforms with CCE storage
  • FFT: Expanded threading to new cases for multi-dimensional transforms
  • VSL: Improved performance of Multivariate Gaussian random number generator for single- and double-precision on 4-core Intel® Xeon® processors 5500 series
  • VML: Improved performance of in-place operation of Add, Mul, and Sub functions on the Intel® Xeon® processor 5500 series
  • Bug fixes

What's New in Intel® MKL 10.3 update 1

  • PARDISO/DSS: Added true F90 overloaded API (see the Intel MKL reference manual for more information)
  • PARDISO: Improved the statistical reporting to be more reader friendly
  • Sparse BLAS: Improved performance of ?BSRMM functions on Intel® Core™ i7 processors
  • FFTs: Support for negative strides
  • FFT examples: Added examples for split-complex FFTs in C and Fortran using both the DFTI and FFTW3 interfaces
  • VML: Improved performance of real in-place Add/Sub/Mul/Sqr functions on systems supporting SSE2 and SSE3
  • Poisson Library: Changed the default behavior of the Poisson library functions from sequential to threaded operation
  • Bug fixes

What's New in Intel® MKL 10.3

  • BLAS
    • New functions for computing 2 matrix-vector products at once: [D/S]GEM2VU, [Z/C]GEM2VC
    • New functions for computing mixed precision general matrix-vector products: [DZ/SC]GEMV
    • New function for computing the sum of two scaled vectors: *AXPBY
    • Intel® AVX optimizations in key functions: SMP LINPACK, level 3 BLAS, DDOT, DAXPY
  • LAPACK
    • New C interfaces for LAPACK supporting row-major ordering
    • Integrated Netlib LAPACK 3.2.2 including one new computational routine (*GEQRFP) and two new auxiliary routines (*GEQR2P and *LARFGP) and the earlier LAPACK 3.2.1 update
    • Intel® AVX optimizations in key functions: DGETRF, DPOTRF, DGEQRF
  • PARDISO
    • Improved performance of factor and solve steps in multi-core environments
    • Introduced the ability to solve for sparse right-hand sides and perform partial solves, producing a partial solution vector
    • Improved performance of the out-of-core (OOC) factorization step
    • Support for zero-based (C-style) array indexing
    • Zeros on the diagonal of the matrix are no longer required in sparse data structures for symmetric matrices
    • New ILP64 PARDISO interface allows the use of both LP64 and ILP64 versions when linked to the LP64 libraries
    • The memory required for storing files on the disk in OOC mode can now be estimated just after reordering
  • Sparse BLAS
    • Format conversion functions now support all data types (single and double precision for real and complex data) and can return sorted or unsorted arrays
  • FFTs
    • New MPI FFTW 3.3alpha1 wrappers cover new cluster functionality
    • Improved load-balancing of cluster FFTs provides improved performance
    • Intel AVX optimizations in all 1D/2D/3D FFTs
    • Improved performance of 2D and 3D mixed-radix FFTs for single and double precision data for all systems supporting the SSE4.2 instruction set
    • Support for split-complex data represented as two real arrays introduced for 2D/3D FFTs
    • Support for 1D complex-to-complex transforms of large prime lengths
    • Introduced Hybrid parallelism (MPI + OpenMP*) on cluster 1D complex transforms and increased performance on vector lengths which are a multiple of the number of MPI processes
  • VML
    • A new function for computing (ax+b)/(cy+d) where a, b, c, and d are scalars, and x and y are real vectors: v[s/d]LinearFrac()
    • Intel AVX optimizations for real functions
    • A new mode for setting denormals to zero, overflow support for complex vectors, and for every VML function a new function with an additional parameter for setting the accuracy mode
  • VSL
    • A set of new Summary Statistics functions was added covering basic statistics, covariance and correlation, pooled, group, partial, and robust covariance/correlation, quantiles and streaming quantiles, outliers detection algorithm, and missing values support
      • Performance optimized algorithms: MI algorithm for support of missing values, TBS algorithm for computation of robust covariance, BACON algorithm for detection of outliers, ZW algorithm for computation of quantiles (streaming data case), and 1PASS algorithm for computation of pooled covariance
    • Improved performance of SFMT19937 Basic Random Number Generator (BRNG)
    • Intel® AVX optimizations: MT19937 and MT2203 BRNGs
  • Documentation: Product documentation is available in the Microsoft Help Viewer* 1.x format that integrates with Microsoft Visual Studio* 2010
  • Added runtime dispatching dynamic libraries allowing link to a single interface library which loads dependent libraries dynamically at runtime depending on runtime CPU detection and/or library function calls
  • The custom dynamic libraries builder now uses the runtime dispatching dynamic libraries on the Linux* and Mac OS* X operating systems
  • A new directory structure has been established to simplify integration of Intel MKL with the Intel® Parallel Studio XE family of products and directories formerly designated as "em64t" are now designated by the "intel64" tag
  • Intel® Itanium® architecture (IA-64) support is not included in this release. Intel® MKL 10.2 is the latest release for IA-64
  • The sparse solver functionality has been fully integrated into the core Intel MKL libraries and the libraries with "solver" in the filename have been removed from the product
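
Two of the new functions listed above are defined by simple element-wise formulas: *AXPBY (the sum of two scaled vectors) and v[s/d]LinearFrac() ((ax+b)/(cy+d)). The pure-Python sketch below spells out that arithmetic. It is a reference for the math only, not the Intel MKL interface: the function names and argument order are hypothetical, and the sketch returns new lists rather than writing into an output array.

```python
def axpby(alpha, x, beta, y):
    """Element-wise alpha * x + beta * y for two equal-length vectors."""
    return [alpha * xi + beta * yi for xi, yi in zip(x, y)]

def linear_frac(a, b, c, d, x, y):
    """Element-wise (a * x + b) / (c * y + d) for scalars a, b, c, d."""
    return [(a * xi + b) / (c * yi + d) for xi, yi in zip(x, y)]

scaled_sum = axpby(2.0, [1.0, 2.0, 3.0], 0.5, [4.0, 6.0, 8.0])
ratio = linear_frac(2.0, 1.0, 1.0, 1.0, [1.0, 2.0], [1.0, 3.0])
```

Fusing the scale, shift, and divide into one pass over the data avoids temporary vectors, which is the usual motivation for providing such combined operations.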

Notices

  • The Intel MKL GNU Multiple Precision* (GMP) function interfaces will be removed in a future library release.
  • The timing function mkl_set_cpu_frequency() is deprecated. Please use mkl_get_max_cpu_frequency(), mkl_get_clocks_frequency(), and mkl_get_cpu_frequency() as described in the Intel® MKL Reference Manual.
  • The MKL_PARDISO constant defined to specify the PARDISO domain should no longer be used with the mkl_domain_set_num_threads() function; please use MKL_DOMAIN_PARDISO instead.
  • Convolution and Correlation routines will not be backward compatible with Intel MKL 10.2 update 3 in a future release.
  • The OpenMP* static runtime library in Intel MKL for Windows* will be removed in a future library release.

For more information on deprecated features, please see the deprecations knowledgebase article.

Product Contents

The Intel® Math Kernel Library (Intel® MKL) version 10.3 consists of three installation packages: one package for both IA-32 and Intel® 64 architectures, one for IA-32 only, and one for Intel® 64 architecture only.

Technical Support

If you did not register your Intel software product during installation, please do so now at the Intel® Software Development Products Registration Center. Registration entitles you to free technical support, product updates and upgrades for the duration of the support term.

For general information about Intel technical support, product updates, user forums, FAQs, tips and tricks and other support questions, please visit http://www.intel.com/software/products/support/.

Note: If your distributor provides technical support for this product, please contact them rather than Intel.

For technical information about Intel MKL, including FAQs, tips and tricks, and other support information, please visit the Intel MKL forum: /en-us/forums/intel-math-kernel-library/ and browse the Intel MKL knowledge base: /en-us/articles/intel-mkl-kb/all/.

Attributions

As referenced in the End User License Agreement, attribution requires, at a minimum, prominently displaying the full Intel product name (e.g. "Intel® Math Kernel Library") and providing a link/URL to the Intel® MKL homepage (http://www.intel.com/software/products/mkl) in both the product documentation and website.

The original versions of the BLAS from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/blas/index.html.

The original versions of LAPACK from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/lapack/index.html. The authors of LAPACK are E. Anderson, Z. Bai, C. Bischof, S. Blackford, J. Demmel, J. Dongarra, J. Du Croz, A. Greenbaum, S. Hammarling, A. McKenney, and D. Sorensen. Our FORTRAN 90/95 interfaces to LAPACK are similar to those in the LAPACK95 package at http://www.netlib.org/lapack95/index.html. All interfaces are provided for pure procedures.

The original versions of ScaLAPACK from which that part of Intel® MKL was derived can be obtained from http://www.netlib.org/scalapack/index.html. The authors of ScaLAPACK are L. S. Blackford, J. Choi, A. Cleary, E. D'Azevedo, J. Demmel, I. Dhillon, J. Dongarra, S. Hammarling, G. Henry, A. Petitet, K. Stanley, D. Walker, and R. C. Whaley.

PARDISO in Intel® MKL is compliant with the 3.2 release of PARDISO that is freely distributed by the University of Basel. It can be obtained at http://www.pardiso-project.org.

Some FFT functions in this release of Intel® MKL have been generated by the SPIRAL software generation system (http://www.spiral.net/) under license from Carnegie Mellon University. The authors of SPIRAL are Markus Puschel, Jose Moura, Jeremy Johnson, David Padua, Manuela Veloso, Bryan Singer, Jianxin Xiong, Franz Franchetti, Aca Gacic, Yevgen Voronenko, Kang Chen, Robert W. Johnson, and Nick Rizzolo.

 

For more complete information about compiler optimizations, see the Optimization Notice.