<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Thu, 24 May 2012 17:30:49 -0700 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/intel-mkl-kb/type/performance-and-optimization/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles Feed</title>
    <link>http://software.intel.com/en-us/articles/intel-mkl-kb/type/performance-and-optimization/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Intel(R) MKL and C++ template libraries</title>
      <description><![CDATA[ <div id="art_pre_template"><br /></div>
<div id="art_pre_template">
<div id="art_pre_template">Some Intel(R) MKL users have indicated that it would be valuable to have a C++ API for invoking MKL functionality.</div>
<div id="art_pre_template"><br /></div>
<div id="art_pre_template">There are a few existing open source C++ template libraries that can be linked with Intel(R) MKL. This makes it possible to use highly abstracted C++ classes for matrix/vector operations, linear algebra factorizations, etc., while achieving about the same performance as Intel(R) MKL itself provides.</div>
<div id="art_pre_template"><br /></div>
<div id="art_pre_template">Please refer to the documentation on the web pages of the C++ libraries. Feel free to choose the package that best fits your needs and/or C++ style requirements.</div>
<div><br /></div>
</div>
<div id="art_pre_template"><br /></div>
<div id="art_pre_template">
<table class="MsoTableGrid" border="1" cellspacing="0" cellpadding="0" >
<tbody>
<tr>
<td width="493" valign="top" >
<p class="MsoNormal" ><b><span lang="EN-US">C++ math library<o:p></o:p></span></b></p>
</td>
<td width="304" valign="top" >
<p class="MsoNormal" ><b><span lang="EN-US">Supported MKL functionality<o:p></o:p></span></b></p>
</td>
</tr>
<tr>
<td width="493" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">Eigen</span><o:p></o:p></p>
<p class="MsoNormal" ><a href="http://eigen.tuxfamily.org/dox-devel/TopicUsingIntelMKL.html">http://eigen.tuxfamily.org/dox-devel/TopicUsingIntelMKL.html</a><o:p></o:p></p>
</td>
<td width="304" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">BLAS (level 2, 3)<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US">LAPACK (LU, Cholesky, QR, SVD, eigenvalues, Schur)<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US">VML<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US">PARDISO<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
</td>
</tr>
<tr>
<td width="493" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">Armadillo</span><o:p></o:p></p>
<p class="MsoNormal" ><a href="http://sourceforge.net/projects/arma/"><span lang="EN-US">http://sourceforge.net/projects/arma/</span></a><o:p></o:p></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><o:p> </o:p></p>
</td>
<td width="304" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">BLAS (dot, gemv,   gemm)<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US">LAPACK (LU, Cholesky, QR, SVD, eigenvalues)<o:p></o:p></span></p>
</td>
</tr>
<tr>
<td width="493" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">MTL4<o:p></o:p></span></p>
<p class="MsoNormal" ><a href="http://www.mtl4.org/"><span lang="EN-US">http://www.mtl4.org/</span></a><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US"><o:p> </o:p></span></p>
</td>
<td width="304" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">BLAS (gemm)<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US">LAPACK (LU)<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
</td>
</tr>
<tr>
<td width="493" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">BOOST uBLAS<o:p></o:p></span></p>
<p class="MsoNormal" ><a href="http://www.boost.org/doc/libs/1_35_0/libs/numeric/ublas/doc/index.htm"><span lang="EN-US">http://www.boost.org/doc/libs/1_35_0/libs/numeric/ublas/doc/index.htm</span></a><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US"><o:p> </o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US">together with BOOST   numeric bindings<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US"><a href="http://mathema.tician.de/software/boost-bindings">http://mathema.tician.de/software/boost-bindings</a></span><span lang="EN-US"><o:p></o:p></span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span lang="EN-US"><o:p> </o:p></span></p>
</td>
<td width="304" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">BLAS<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US">LAPACK<o:p></o:p></span></p>
</td>
</tr>
<tr>
<td width="493" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">Trilinos</span><o:p></o:p></p>
<p class="MsoNormal" ><a href="http://trilinos.sandia.gov/">http://trilinos.sandia.gov/</a><o:p></o:p></p>
<p class="MsoNormal" > </p>
</td>
<td width="304" valign="top" >
<p class="MsoNormal" ><span lang="EN-US"><br /></span></p>
<p class="MsoNormal" ><span lang="EN-US">BLAS<o:p></o:p></span></p>
<p class="MsoNormal" ><span lang="EN-US">LAPACK<o:p></o:p></span></p>
</td>
</tr>
</tbody>
</table>
</div>
<div id="art_pre_template"><br /></div>
<div id="art_pre_template"><br /></div>
<div id="art_pre_template"><br /></div>
<div id="art_pre_template">
<div id="art_pre_template"><b><i><span >Important note:</span></i></b></div>
<div id="art_pre_template"><br /></div>
<div id="art_pre_template">While the libraries above are known to be high-quality products, Intel cannot guarantee that all the calls to MKL functions are implemented 100% correctly in them.  If you find an issue while incorporating code from those libraries into your program, please first report the issue to the owners of these libraries. However, if the issue is caused by using Intel MKL, please report it to us by submitting an issue on the <a href="http://software.intel.com/en-us/forums/intel-math-kernel-library/">Intel MKL forum</a> or through <a href="https://premier.intel.com">Intel Premier Support</a>.</div>
<div><br /></div>
</div>
<div id="art_pre_template"><br /></div>
<div id="art_pre_template"><br /></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/intelr-mkl-and-c-template-libraries/</link>
      <pubDate>Sun, 11 Mar 2012 13:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/intelr-mkl-and-c-template-libraries/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intelr-mkl-and-c-template-libraries/</guid>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
    <item>
      <title>Numpy/Scipy with Intel® MKL</title>
      <description><![CDATA[ <p><br /><strong>NumPy/SciPy Application Note<b></b><br /></strong><br /><b>Step 1 - Overview</b><br /><br />This guide is intended to help current NumPy/SciPy users to take advantage of Intel® Math Kernel Library (Intel® MKL). <br /><br /><strong>NumPy </strong>automatically maps operations on vectors and matrices to the BLAS and LAPACK functions wherever possible. Since Intel® MKL supports these de-facto interfaces, NumPy can benefit from Intel MKL optimizations through simple modifications to the NumPy scripts.<br /><br /><a href="http://en.wikipedia.org/wiki/NumPy">NumPy</a> is the fundamental package required for scientific computing with Python. It consists of:</p>
<ul type="disc">
<li>a powerful N-dimensional array object</li>
<li>sophisticated (broadcasting) functions</li>
<li>tools for integrating C/C++ and Fortran code</li>
<li>useful linear algebra, Fourier transform, and random number capabilities.</li>
</ul>
<p>Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data.</p>
<p>For more information on NumPy, please visit http://NumPy.scipy.org/<br /><br /><strong>SciPy </strong>includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more.  The SciPy library depends on NumPy, which provides convenient and fast N-dimensional array manipulation. The SciPy library is built to work with NumPy arrays and provides many user-friendly and efficient numerical routines, such as routines for numerical integration and optimization, for Python users. Please refer to <a href="http://scipy.org/">http://www.scipy.org</a> for more details on SciPy.<br /><br /><img height="410" width="637" src="http://software.intel.com/file/40584" alt="numpy-matrix-multiply.jpg" title="numpy-matrix-multiply.jpg" /><br /><br /><b>Version Information</b><br /><br />This application note was created to help NumPy/SciPy users make use of the latest versions of Intel MKL on Linux platforms.</p>
<p>The instructions given in this article apply to Intel MKL 10.3 and above and Intel Compiler 11.0 and above.<br /><br /><b>Step 2 - Downloading NumPy and SciPy Source Code</b><br /><br />The NumPy and SciPy source code can be downloaded from:</p>
<p><a href="http://www.scipy.org/Download">http://www.scipy.org/Download</a><br /><br /><b>Prerequisites</b><br /><br />Intel MKL can be obtained through one of the following options:</p>
<p>Download a FREE evaluation version of the Intel MKL product.<br />Download the FREE non-commercial* version of the Intel MKL product.<br /><br />All of these can be obtained at the <a href="http://www.intel.com/cd/software/products/asmo-na/eng/307757.htm">Intel® Math Kernel Library product web page</a>.<br /><br />Intel® MKL is also bundled with the following products:<br /><br /><a href="http://software.intel.com/en-us/articles/intel-parallel-studio-xe/">Intel® Parallel Studio XE 2011</a><br /><a href="http://software.intel.com/en-us/articles/intel-composer-xe/">Intel Composer XE 2011</a><br /><a href="http://software.intel.com/en-us/articles/intel-cluster-studio/">Intel Cluster Studio 2011</a><br /><br /><b>Step 3 - Configuration</b><br /><br />Use the following commands to <b>extract the NumPy tar files </b>from the downloaded numpy-x.x.x.tar.gz.</p>
<pre name="code" class="shell">$gunzip numpy-x.x.x.tar.gz<br />$tar -xvf numpy-x.x.x.tar </pre>
<p><br />The above will create a directory named numpy-x.x.x.<br /><br />To extract SciPy, use the commands below:</p>
<pre name="code" class="shell">$gunzip scipy-x.x.x.tar.gz<br />$tar -xvf scipy-x.x.x.tar </pre>
<p><br />The scipy-x.x.x directory will be created with the extracted files.<br /><br />Make sure that the C++ and Fortran compilers are installed and in your PATH. Also set LD_LIBRARY_PATH to include the compiler (C++ and Fortran) and MKL library directories.<br /><br /><b>Step 4 - Building and Installing NumPy</b><br /><br />Change directory to numpy-x.x.x<br />Create a site.cfg from the existing template (site.cfg.example)<br /><br />Edit site.cfg as follows:</p>
<p>If you are building on an Intel 64 platform, add the following lines to site.cfg in your top-level NumPy directory to use Intel® MKL:</p>
<pre name="code" class="cpp">[mkl]<br />library_dirs = /opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64<br />include_dirs = /opt/intel/composer_xe_2011_sp1.6.233/mkl/include<br />mkl_libs = mkl_rt<br />lapack_libs =</pre>
<p><br />If you are building NumPy for a 32-bit platform, add the following instead:</p>
<pre name="code" class="cpp">[mkl]<br />library_dirs = /opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/ia32<br />include_dirs = /opt/intel/composer_xe_2011_sp1.6.233/mkl/include<br />mkl_libs = mkl_rt<br />lapack_libs = </pre>
<p>Modify cc_exe in numpy/distutils/intelccompiler.py to be something like:</p>
<pre name="code" class="cpp">self.cc_exe = 'icc -O3 -g -fPIC -fp-model strict -fomit-frame-pointer -openmp -xhost' </pre>
<p>Here, -O3 optimizes for speed and enables more aggressive loop transformations such as fusion, block-unroll-and-jam, and collapsing of IF statements; -openmp enables OpenMP threading; and -xhost tells the compiler to generate instructions for the highest instruction set available on the compilation host processor. If you are using the ILP64 interface, add the -DMKL_ILP64 compiler flag.</p>
<p>Run icc --help for more information on processor-specific options, and refer to the Intel Compiler documentation for more details on the various compiler flags.</p>
<p>Modify the Fortran compiler configuration in numpy-x.x.x/numpy/distutils/fcompiler/intel.py to use the following compiler options for the Intel Fortran Compiler:<br /><br />For ia32 and Intel64</p>
<pre name="code" class="python">ifort -xhost -openmp -fp-model strict -fPIC
</pre>
<p><br />If you are using the ILP64 interface of Intel MKL, add the -i8 flag to the options above.  For reference, you can download a modified <a href="http://software.intel.com/file/42863"><strong>intel.py</strong></a> that already uses the compiler options mentioned above and use it to replace the original file.<br /><br />Compile and install NumPy with the Intel compiler (on 64-bit platforms replace "intel" with "intelem"):</p>
<pre name="code" class="cpp">python setup.py config --compiler=intel build_clib --compiler=intel build_ext --compiler=intel install </pre>
<p><br /><strong>Build and Install SciPy<br /><br /></strong>Compile and install SciPy with the Intel Compilers: (On 64-bit platforms replace "intel" with "intelem")</p>
<pre name="code" class="python">$python setup.py config --compiler=intel --fcompiler=intel build_clib --compiler=intel --fcompiler=intel build_ext --compiler=intel --fcompiler=intel install</pre>
<p><br /><strong>Set Up the Library Path for Intel MKL and Intel Compilers<br /></strong><br />If you build NumPy/SciPy for Intel 64 platforms:</p>
<pre name="code" class="shell">$export LD_LIBRARY_PATH=/opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/intel64:/opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/intel64:$LD_LIBRARY_PATH </pre>
<p>If you build NumPy/SciPy for ia32 (32-bit) platforms:</p>
<pre name="code" class="shell">$export LD_LIBRARY_PATH=/opt/intel/composer_xe_2011_sp1.6.233/mkl/lib/ia32:/opt/intel/composer_xe_2011_sp1.6.233/compiler/lib/ia32:$LD_LIBRARY_PATH </pre>
<p>LD_LIBRARY_PATH may cause problems if you have installed Intel MKL and Intel Composer XE in directories other than the standard ones. The only solution I've found that always works is to build Python, NumPy and SciPy inside an environment where you've set the LD_RUN_PATH variable, e.g.:</p>
<pre name="code" class="cpp">$export LD_RUN_PATH=~/opt/lib:~/intel/composer_xe_2011_sp1.6.233/compiler/lib:~/intel/composer_xe_2011_sp1.6.233/mkl/lib/ia32
</pre>
<p><br /><b>Note:</b> We recommend using arrays with 'C' ordering (row-major), which is the NumPy default, rather than Fortran ordering (column-major). NumPy uses CBLAS, and C ordering gives better performance.<br /><br /><b>Appendix A: Example</b> <br /><br />Below is an example Python script for matrix multiplication, provided for illustration purposes, that you can run with NumPy built with Intel MKL.</p>
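<pre name="code" class="python">import numpy as np
import time

# Matrix dimensions: A is M x N, B is N x K, so C = A*B is M x K
N = 6000
M = 10000

k_list = [64, 80, 96, 104, 112, 120, 128, 144, 160, 176, 192, 200, 208, 224, 240, 256, 384]

def get_gflops(M, N, K):
    # Each of the M*K entries of C takes N multiplications and N-1 additions;
    # the result is expressed in units of 10^9 floating-point operations
    return M*K*(2.0*N-1.0) / 1000**3

# Print the BLAS/LAPACK libraries this NumPy build is linked against
np.show_config()

for K in k_list:
    a = np.array(np.random.random((M, N)), dtype=np.double, order='C', copy=False)
    b = np.array(np.random.random((N, K)), dtype=np.double, order='C', copy=False)
    A = np.matrix(a, dtype=np.double, copy=False)
    B = np.matrix(b, dtype=np.double, copy=False)

    # Warm-up multiplication, excluded from the timing
    C = A*B

    start = time.time()

    # Time five multiplications and report the average
    for _ in range(5):
        C = A*B

    end = time.time()

    tm = (end-start) / 5.0

    # Columns: K, average seconds per multiplication, GFLOP/s
    print("{0:4}, {1:9.7}, {2:9.7}".format(K, tm, get_gflops(M, N, K) / tm)) </pre>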
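<p>Regarding the note on array ordering above, the following minimal sketch shows one way to check whether a NumPy array is stored in C (row-major) order and to obtain a C-ordered copy of a Fortran-ordered array. It uses only standard NumPy calls; the array names are illustrative and not part of the original example.</p>
<pre name="code" class="python">import numpy as np

# NumPy creates arrays in C (row-major) order by default
a = np.zeros((3, 4))
print(a.flags['C_CONTIGUOUS'])   # True

# An explicitly Fortran-ordered (column-major) array
f = np.zeros((3, 4), order='F')
print(f.flags['C_CONTIGUOUS'])   # False

# Convert to C order (a copy is made only when the layout has to change)
c = np.ascontiguousarray(f)
print(c.flags['C_CONTIGUOUS'])   # True
</pre>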
<p> <br /><b>Appendix B: Performance Comparison<br /><br /></b><br /><br /><img height="593" width="567" src="http://software.intel.com/file/41780" alt="numpy_mkl_svd_comparison.jpg" title="numpy_mkl_svd_comparison.jpg" /><br /><br /><img height="532" width="580" src="http://software.intel.com/file/41779" alt="numpy_mkl_lu_comparison.jpg" title="numpy_mkl_lu_comparison.jpg" /> <br /><br /><img height="504" width="561" src="http://software.intel.com/file/41778" alt="numpy_mkl_cholesky_comparison.jpg" title="numpy_mkl_cholesky_comparison.jpg" /><br /><br />Please click <a href="http://software.intel.com/file/41177"><b>Examples.py</b></a> to download the examples for LU, Cholesky and SVD.<br /><br /></p> ]]></description>
      <link>http://software.intel.com/en-us/articles/numpy-scipy-with-mkl/</link>
      <pubDate>Sun, 11 Mar 2012 11:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/numpy-scipy-with-mkl/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/numpy-scipy-with-mkl/</guid>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
    <item>
      <title>Introduction to the New Functions Providing Bitwise Reproducibility</title>
      <description><![CDATA[ <p>Intel® MKL 11.0 introduces functions to help users obtain conditional bitwise reproducible (CBWR) floating-point results when calling library functions from their application.  When using these new features, Intel MKL functions are designed to return the same floating-point results from run-to-run, subject to the following limitations:</p>
<ul>
<li>calls to Intel® MKL occur in a single executable</li>
<li>input and output arrays in function calls must be aligned on 16-, 32-, or 64-byte boundaries on systems that support SSE, AVX1, or AVX2 instructions, respectively</li>
<li>the number of computational threads used by the library remains constant throughout the run </li>
</ul>
<p>It is well known that for general single and double precision IEEE floating-point numbers, the associative property does not always hold, meaning (a+b)+c may not equal a+(b+c).  Let's consider a specific example. In infinite precision arithmetic 2<sup>-63</sup> + 1 + (-1) = 2<sup>-63</sup>. If instead we do this same computation on a computer using double precision floating-point numbers, rounding error is introduced and we clearly see why the order of operations becomes important:</p>
<p align="center">(2<sup>-63</sup> + 1) + (-1) ≈ 1 + (-1) = 0</p>
<p>versus</p>
<p align="center">2<sup>-63</sup> + (1 + (-1)) ≈ 2<sup>-63</sup> + 0 = 2<sup>-63</sup></p>
<p>This inconsistency in results due to order-of-operations is precisely what the new functions are designed to address.</p>
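<p>The two groupings above can be checked directly. The following is a minimal Python sketch (an illustration only, not part of Intel MKL) that evaluates both orders in IEEE double precision; the variable names are arbitrary.</p>
<pre name="code" class="python">tiny = 2.0 ** -63

# Group the first two terms: tiny is lost to rounding when added to 1.0
left = (tiny + 1.0) + (-1.0)

# Group the last two terms: 1.0 and -1.0 cancel exactly, so tiny survives
right = tiny + (1.0 + (-1.0))

print(left)           # 0.0
print(right == tiny)  # True
</pre>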
<p>The application related factors that affect the order of floating-point operations within a single executable program include code-path selection based on run-time processor dispatching, data array alignment, variation in number of threads, threaded algorithms and internal floating-point control settings. Most of these factors can be controlled by the user by properly controlling the number of threads, floating point settings and taking steps to align memory when it is allocated (see this <a href="http://software.intel.com/en-us/articles/getting-reproducible-results-with-intel-mkl/">previous article on getting reproducible results</a>). On the other hand run-time dispatching and certain threaded algorithms have not allowed users to make changes that can ensure the same order of operations from run to run.</p>
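<p>As a rough illustration of how the number of threads can change the order of operations, the sketch below mimics a threaded reduction in plain Python: the input is split into contiguous chunks, each chunk is summed separately, and the partial sums are then combined. The helper names <code>sequential_sum</code> and <code>chunked_sum</code> are hypothetical and do not correspond to any Intel MKL function; the sketch assumes the input length is divisible by the number of chunks.</p>
<pre name="code" class="python">def sequential_sum(values):
    # Add the values strictly from left to right
    total = 0.0
    for v in values:
        total = total + v
    return total

def chunked_sum(values, num_chunks):
    # Split into contiguous chunks (one per simulated thread), sum each chunk,
    # then combine the partial sums from left to right.
    # Assumes len(values) is divisible by num_chunks.
    size = len(values) // num_chunks
    partials = []
    for i in range(num_chunks):
        partials.append(sequential_sum(values[i * size:(i + 1) * size]))
    return sequential_sum(partials)

tiny = 2.0 ** -63
data = [1.0, tiny, -1.0, tiny]

one_thread = chunked_sum(data, 1)
two_threads = chunked_sum(data, 2)

print(one_thread == tiny)  # True
print(two_threads)         # 0.0
</pre>
<p>The same four inputs give two different sums depending only on how the work is divided, which is why a constant number of threads is one of the conditions listed above.</p>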
<p>Intel MKL does run-time processor dispatching in order to identify the appropriate internal code paths to traverse for the Intel MKL functions called by the application. The code paths chosen may differ across a wide range of Intel processors and IA compatible processors and may provide differing levels of performance. For example, an Intel MKL function running on an Intel® Pentium® 4 processor may run an SSE2-based code path, while on a more recent Intel® Xeon® processor supporting Intel® Advanced Vector Extensions (AVX), that same library function may dispatch to a different code-path that uses these AVX instructions. This is because each unique code path has been optimized to match the features available on the underlying processor. The feature-based approach introduces a challenge: if any of the internal floating-point operations are done in a different order, or are re-associated, then the computed results may differ.</p>
<p>Dispatching optimized code-paths based on the capabilities of the processor on which it is running is central to the optimization approach used by Intel MKL so it is natural that there should be some performance trade-offs when requiring consistent results. If limited to a particular code-path, Intel MKL performance can in some circumstances degrade by more than half. This can be easily understood by noting that matrix-multiply performance nearly doubled with the introduction of new processors supporting AVX instructions. In other cases, performance may degrade by 10-20% if algorithms are restricted to maintain the order of operations.</p>
<p>Intel® MKL 11.0 includes new functions and environment variables, shown in figures 1, 2, and 3  designed to help users get bitwise reproducible results from the Intel MKL functions used.  To better understand how to use these features, some usage examples are provided below. Only the MKL_CBWR_AUTO and MKL_CBWR_COMPATIBLE options are supported on non-Intel CPUs.</p>
<p>To ensure MKL calls return the same results on <b>all Intel or Intel compatible CPUs supporting SSE2 instructions or later</b>, make sure your application uses a fixed number of threads, input and output arrays in Intel MKL function calls are aligned properly, and call</p>
<p ><code>mkl_cbwr_set(MKL_CBWR_COMPATIBLE)</code></p>
<p>or set the environment variable</p>
<p ><code>MKL_CBWR_BRANCH = "COMPATIBLE"</code></p>
<p>Note: the special MKL_CBWR_COMPATIBLE option is provided because Intel and Intel compatible CPUs have approximation instructions (e.g., rcpps/rsqrtps) that may return different results. This option ensures that Intel MKL uses an SSE2 only code-path which does not use these instructions.</p>
<p> </p>
<p>To ensure MKL calls return the same results on <b>every Intel CPU that supports SSE2 instructions or later</b>, make sure your application uses a fixed number of threads, input and output arrays are aligned properly, and call</p>
<p ><code>mkl_cbwr_set(MKL_CBWR_SSE2) </code></p>
<p>or set the environment variable</p>
<p ><code>MKL_CBWR_BRANCH = "SSE2"</code></p>
<p><i>Note: on non-Intel CPUs the results may differ because the MKL_CBWR_COMPATIBLE code-path is run instead.</i></p>
<p> </p>
<p>To ensure MKL calls return the same results on <b>every Intel CPU that supports SSE4.1 instructions or later</b>, make sure your application uses a fixed number of threads, input and output arrays are aligned properly, and call</p>
<p ><code>mkl_cbwr_set(MKL_CBWR_SSE4_1)</code></p>
<p>or set the environment variable</p>
<p ><code>MKL_CBWR_BRANCH = "SSE4_1"</code></p>
<p><i>Note: on non-Intel CPUs the results may differ because the MKL_CBWR_COMPATIBLE code-path is run instead.</i></p>
<p> </p>
<p>To ensure MKL calls return the same results on <b>every Intel CPU that supports AVX instructions or later</b>, make sure your application uses a fixed number of threads, input and output arrays are aligned properly, and call</p>
<p ><code>mkl_cbwr_set(MKL_CBWR_AVX)</code></p>
<p>or set the environment variable</p>
<p ><code>MKL_CBWR_BRANCH = "AVX"</code></p>
<p><i>Note: on non-Intel CPUs the results may differ because the MKL_CBWR_COMPATIBLE code-path is run instead. On an Intel CPU without AVX support, the MKL_CBWR_DEFAULT path is run instead.</i></p>
<p> </p>
<p>Please consult the user guide for additional details.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/intro-to-CBWR-in-intel-mkl/</link>
      <pubDate>Tue, 21 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/intro-to-CBWR-in-intel-mkl/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intro-to-CBWR-in-intel-mkl/</guid>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
    <item>
      <title>Questions and Answers for the Intel® Math Kernel Library Webinar on November 11, 2010</title>
      <description><![CDATA[ <p>If you missed our webinar "Get Ready for Intel® Math Kernel Library 10.3 — A Component of Intel® Composer XE 2011" presented on Nov 11, 2010, please download a <a href="http://software.intel.com/file/32140">recording of the webinar</a> as well as a <a href="http://software.intel.com/file/32143">PDF file of the slides</a>. Below are listed some of the questions and answers that were brought up during the presentation.</p>
<p><b><br /></b></p>
<p><b>Questions on the new releases, version numbers, and upgrade</b></p>
<p><b>Q</b>: what's the latest mkl version? we have Intel compiler 11.1 which includes mkl. Does this mean the mkl version is 11.1?<br /><b>A</b>: The version of Intel MKL that is included in the Intel Compiler 11.1 is Intel MKL 10.2 (and updates of this compiler contain updates of Intel MKL). If you’d like to know which version of Intel MKL you’re using, you can check the mklsupport* file in the doc or Documentation directory. Another place to look is one of the following knowledgebase articles:</p>
<ul>
<li><a href="http://software.intel.com/en-us/articles/which-version-of-ipp--mkl--tbb-is-installed-with-intel-compiler-professional-edition/">Which version of Intel IPP, Intel MKL and Intel TBB is installed by the Intel® Compiler Professional Edition?</a> </li>
<li><a href="http://software.intel.com/en-us/articles/which-version-of-the-intel-ipp-intel-mkl-and-intel-tbb-libraries-are-included-in-the-intel-composer-bundles/">Which version of the Intel® IPP, Intel® MKL and Intel® TBB Libraries are Included in the Intel® Composer Bundles?</a> </li>
</ul>
<p><b>Q</b>: You mentioned that Intel Composer XE contains the C++ compiler. Why is there a separate C++ Composer XE product then?<br /><b>A</b>: There is an Intel® C++ Composer XE 2011 as well as an Intel® Fortran Composer XE 2011. These are available for those that need only one of the two compilers in Intel® Composer XE 2011. See the <a href="http://software.intel.com/en-us/articles/buy-or-renew/">buy or renew products page</a> for a full list.</p>
<p><b>Q</b>: Intel® Parallel Studio XE 2011 - your first slide did not mention Mac OS* X. Why?<br /><b>A</b>: Intel® Parallel Studio XE 2011 is a suite of tools, some of which are not available for Mac OS* X. The compiler and library support continues through the Intel® C++ Composer XE 2011 and Intel® Fortran Composer XE 2011 for Mac OS* X products. See: <a href="http://software.intel.com/en-us/articles/intel-sdp-products/" title="http://software.intel.com/en-us/articles/intel-sdp-products/">http://software.intel.com/en-us/articles/intel-sdp-products/</a>.</p>
<p><b>Q</b>: I recently purchased Intel Fortran Compiler 11.1, is the new Intel MKL 10.3 included as an upgrade?<br /><b>A</b>: Yes. Anyone with a current (unexpired) license for Intel MKL or professional editions of the Intel Compilers can obtain the new version of those tools (including Intel MKL 10.3).</p>
<!--
<p><b>Q</b>: Is VTune going to be Windows 7 compatible? Or AmplifierXE is the profiler for Windows going forward?<br /><b>A</b>: Will send to Dave Mackay and team.</p>
-->
<p><b><br /></b></p>
<!--      ************       -->
<p><b>Performance</b><br /><br /><b>Q</b>: In which areas does ATLAS beat MKL?<br /><b>A</b>: We do regular performance benchmarking against ATLAS. We are not aware of places where ATLAS has better performance. If you know of any, please let us know.</p>
<p><b>Q</b>: Is the hybrid improvement (MPI+OpenMP*) mainly for 3D FFTs?<br /><b>A</b>: Earlier we had implemented hybrid parallelism for 3D FFTs, and now we have introduced it (MPI + OpenMP*) for cluster 1D complex transforms too. Most of the improvement is for vector lengths which are a multiple of the number of MPI processes.</p>
<!--
<p><b>Q</b>: You have performance comparisions between all MPI vs mixed MPI-OpenMP?<br /><b>A</b>: We see pure MPI works better on cluster system, mixed MPI-OpenMP works better for large SMP system, i.e. NHM-ex 4 socket x 8 cores (total of 32 cores) system</p>
<p><b>Q</b>: In that case, how much better is MPI-OpenMP compared to OpenMP alone?<br /><b>A</b>:</p>
-->
<p><b>Q</b>: Do you have a performance comparison with 64-bit GotoBLAS? I observed that GotoBLAS is faster in some case but a bit unstable.<br /><b>A</b>: No we don't have any performance comparisons.</p>
<p><b>Q</b>: You talked about the MKL 10.3 performance on the new 6 core system. What about the performance on the existing Nehalem system?<br /><b>A</b>: You can find more performance information on the <a href="http://software.intel.com/en-us/articles/intel-mkl/">Intel MKL site</a> under the 'resources' tab.</p>
<p><b>Q</b>: I am testing PARDISO in my laptop and comparing with DSS sparse solver. They solve the matrix in the same amount of time. Am I doing something wrong?<br /><b>A</b>: This would be expected. DSS sits on top of Pardiso. The main value of DSS is that it provides a simplified interface.</p>
<!--
<p><b>Q</b>: Have you done any MKL benchmarking on AMD architecture?<br /><b>A</b>: Yes. It’s our goal to be the best performing math library available on all systems with Intel architecture.</p>
--><!--      ************       --><br />
<p><b>Questions on Intel MKL features</b></p>
<p><b>Q</b>: C interface to LAPACK - what kind of overhead with using it? I have been okay with the current usage, confusing but okay. Should I move over?<br /><b>A</b>: New kernels are steadily being introduced to eliminate the performance and memory overhead of transposition required for row-major data structures. Let us know if you have questions about a particular LAPACK function in a particular version of Intel MKL.</p>
<p><b>Q</b>: Random number generator - should I expect to see differences in numbers generated (everything else constant including CPU) in a 32-bit build and a 64-bit build?<br /><b>A</b>: There are some cases where random generators differ between 32- and 64-bit builds due to different accuracy level of the VML functions used. <i>More info to follow...</i></p>
<p><b>Q</b>: Can you explain disadvantages of linking the OpenMP run-time library statically?<br /><b>A</b>: If your application or plug-in will be used in an environment where any other application or plug-in might also be threaded using OpenMP* then the OpenMP run-time may stall when it finds another statically linked run-time library already initialized.<br /><!--      ************       --><br /><b>Usage and tips</b></p>
<p><b>Q</b>: Are there any common mistakes that new users make that frequently lead to undefined behaviour?<br /><b>A</b>: There are many ways that we see users make mistakes, but they are not so easy to enumerate. Here are some general categories that the problems can fall into: linking problems (use the <a href="http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor/">link line advisor</a> or <a href="http://software.intel.com/en-us/articles/intel-math-kernel-library-documentation/">user’s guide</a>), improper data layout (e.g., for FFTs or PARDISO—see the <a href="http://software.intel.com/en-us/articles/intel-math-kernel-library-documentation/">reference manual</a>), or simple API misunderstandings (see the <a href="http://software.intel.com/en-us/articles/intel-math-kernel-library-documentation/">reference manual</a>).</p>
<p><b>Q</b>: Is the icc -mkl=[parallel,sequential] link method recommended?<br /><b>A</b>: Yes! Take a look at the "<a href="http://software.intel.com/en-us/articles/using-mkl-in-intel-compiler-mkl-qmkl-options/">Using MKL in Intel® Compiler - mkl, Qmkl options</a>" article for more information.</p>
<p><b>Q</b>: Is there some support for Eclipse IDE?<br /><b>A</b>: Some of our documentation is integrated into the IDE, and the product documentation describes how to set up Eclipse for use with Intel MKL. Take a look at the “Programming with Intel® Math Kernel Library in the Eclipse* Integrated Development Environment (IDE)” section of the <a href="http://software.intel.com/en-us/articles/intel-math-kernel-library-documentation/">User's Guide for Linux*</a>. <br /><br /><b>Plans for the future</b></p>
<p><b>Q</b>: What new library functionality can users expect to appear in Intel MKL in coming years?<br /><b>A</b>: A package of Eigensolvers, a cluster version of PARDISO, extensions to VSL, etc.</p>
<p><b>Q</b>: In the roadmap, do you have any implementations on GPU?<br /><b>A</b>: We have no current plans to extend GPU support.</p>
<!--
<p><b>Q</b>: Any word re C++ interface to MKL? E.g. going from Boost(uBLAS) to MKL? <br /><b>A</b>:</p>
--> ]]></description>
      <link>http://software.intel.com/en-us/articles/questions-and-answers-for-the-intel-math-kernel-library-webinar-on-november-11-2010/</link>
      <pubDate>Sun, 14 Nov 2010 21:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/questions-and-answers-for-the-intel-math-kernel-library-webinar-on-november-11-2010/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/questions-and-answers-for-the-intel-math-kernel-library-webinar-on-november-11-2010/</guid>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
    <item>
      <title>New fast basic random number generator SFMT19937 in Intel MKL</title>
      <description><![CDATA[ <br /><br />Intel MKL 10.3 introduced a new basic generator: <strong>SFMT19937</strong>, a SIMD-friendly Fast Mersenne Twister pseudorandom number generator.<br /><br /><strong>SFMT19937</strong> is analogous to the Mersenne Twister (MT) basic generators, but it takes advantage of SIMD instructions to provide a faster implementation on processors that support them.<br /><br /><br />To learn more about the SFMT algorithm, see the article below.<br /><br /><em>Saito, M., and Matsumoto, M. SIMD-oriented Fast Mersenne Twister: a 128-bit Pseudorandom Number Generator. Monte Carlo and Quasi-Monte Carlo Methods 2006, Springer, Pages 607 – 622, 2008.<br /></em><a href="http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/earticles.html"><em>http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/ARTICLES/earticles.html</em></a><br /><br /><br />The following is an example application that uses the Intel MKL SFMT19937 generator:<br /><br /><br />
<pre name="code" class="cpp">#include &lt;stdio.h&gt;
#include "mkl_vsl.h"
 
int main()
{
   double r[1000]; /* buffer for random numbers */
   double s; /* average */
   VSLStreamStatePtr stream;
   int i, j;
    
   /* Initializing */        
   s = 0.0;
   vslNewStream( &amp;stream, VSL_BRNG_SFMT19937, 777 );
    
   /* Generating */        
   for ( i=0; i&lt;10; i++ )
   {
      vdRngGaussian( VSL_RNG_METHOD_GAUSSIAN_ICDF, stream, 1000, r, 5.0, 2.0 );
      for ( j=0; j&lt;1000; j++ )
      {
         s += r[j];
      }
   }
   s /= 10000.0;
    
   /* Deleting the stream */        
   vslDeleteStream( &amp;stream );
    
   /* Printing results */        
   printf( "Sample mean of normal distribution = %f\n", s );
    
   return 0;
}
</pre>
<br /><br /> ]]></description>
      <link>http://software.intel.com/en-us/articles/new-fast-basic-random-number-generator-sfmt19937-in-intel-mkl/</link>
      <pubDate>Sat, 06 Nov 2010 11:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/new-fast-basic-random-number-generator-sfmt19937-in-intel-mkl/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/new-fast-basic-random-number-generator-sfmt19937-in-intel-mkl/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Cluster Toolkit for Linux* Knowledge Base</category>
      <category>Intel® Cluster Toolkit for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
    <item>
      <title>Denormal paths speedup in VML by setting FTZ/DAZ setting</title>
      <description><![CDATA[ <p>Starting with Intel MKL 10.3, the Intel® MKL VML mode variable is extended with a new setting that controls the handling of denormalized numbers.</p>
<p>Users can turn this setting on or off by passing VML_FTZDAZ_ON or VML_FTZDAZ_OFF (the default) to VML functions.</p>
<p>VML_FTZDAZ_ON mode improves the performance of computations that involve denormalized numbers at the cost of some accuracy loss.</p>
<p>Enabling this mode changes numerical behavior of the functions:  denormalized input values may be treated as zeros and denormalized results may flush to zero.  Accuracy loss may occur if input and/or output values are close to denormal range.</p>
<p>Usage example:</p>
<p>vmlSetMode( VML_LA | VML_FTZDAZ_ON);</p>
<p>vmdExp(1000, a, r, VML_LA | VML_FTZDAZ_ON);</p>
<br /><br /><br />
<p>
<table cellpadding="5" cellspacing="0" rules="none" border="1">
<tbody>
<tr>
<th align="left" valign="middle" >Optimization Notice</th>
</tr>
<tr bgcolor="#ccecff">
<td>
<p>Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.</p>
<p align="right">Notice revision #20110804</p>
</td>
</tr>
</tbody>
</table>
 ]]></description>
      <link>http://software.intel.com/en-us/articles/denormal-paths-speedup-in-vml-by-setting-ftzdaz-setting/</link>
      <pubDate>Sat, 06 Nov 2010 11:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/denormal-paths-speedup-in-vml-by-setting-ftzdaz-setting/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/denormal-paths-speedup-in-vml-by-setting-ftzdaz-setting/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Math Kernel Library Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
    </item>
    <item>
      <title>Intel® AVX optimization in Intel® MKL</title>
      <description><![CDATA[ Intel® AVX (Intel® Advanced Vector Extensions) is the next step in the evolution of Intel processors. Intel® MKL has included Intel® AVX optimizations since Intel MKL 10.2; however, to activate the Intel AVX code in version 10.2, users needed to call mkl_enable_instructions(). Starting with Intel MKL 10.3, the Intel AVX code is dispatched automatically and requires no special activation. In Intel MKL 10.3, Intel AVX optimization has been extended to DGEMM/SGEMM, radix-2 complex-to-complex FFTs, most real VML functions, and VSL distribution generators.<br /><br />The following cases illustrate the speedups that Intel MKL 10.3 can achieve on Intel AVX-enabled processors running an Intel AVX-enabled operating system, compared with the Intel® Xeon® Processor 6000 and 7000 Sequence (Server):<br /><br />Intel AVX DGEMM (M, N, K = 8K, 4K, 128) delivers 1.8x the performance of the Intel® Xeon® Processor 6000 and 7000 Sequence (Server).<br /><br />Intel AVX DGEMM/SGEMM achieves 88-90% of machine peak.<br /><br />The Intel AVX/NHM speedup is 1.8x for radix-2 1D cluster FFTs with N=1024.<br /><br />The Intel® Optimized LINPACK benchmark, using Intel AVX optimizations, performs over 1.86x (or over 80% overall efficiency) on 4 cores with N=20000.<br /><br /><br />
<table cellpadding="5" cellspacing="0" rules="none" border="1">
<tbody>
<tr>
<th align="left" valign="middle" >Optimization Notice</th>
</tr>
<tr bgcolor="#ccecff">
<td>
<p>Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.</p>
<p align="right">Notice revision #20110804</p>
</td>
</tr>
</tbody>
</table> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-avx-optimization-in-intel-mkl-v103/</link>
      <pubDate>Wed, 03 Nov 2010 11:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-avx-optimization-in-intel-mkl-v103/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-avx-optimization-in-intel-mkl-v103/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Cluster Toolkit for Linux* Knowledge Base</category>
      <category>Intel® Cluster Toolkit for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
    <item>
      <title>Information about the FTC Decision and Order on the Intel® Compilers Reimbursement Fund</title>
      <description><![CDATA[ Information on the Intel Compiler Reimbursement Fund referenced in Section VII.D of the FTC Decision and Order is available now. Please see the site, <a href="http://www.CompilerReimbursementProgram.com">www.CompilerReimbursementProgram.com</a>, for further information. ]]></description>
      <link>http://software.intel.com/en-us/articles/information-about-the-ftc-decision-and-order-on-the-intel-compilers-reimbursement-fund/</link>
      <pubDate>Mon, 01 Nov 2010 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/information-about-the-ftc-decision-and-order-on-the-intel-compilers-reimbursement-fund/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/information-about-the-ftc-decision-and-order-on-the-intel-compilers-reimbursement-fund/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Software Development Tool Suites for Intel® Atom™ Processor Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Integrated Performance Primitives Knowledge Base</category>
      <category>Intel® Math Kernel Library Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
    </item>
    <item>
      <title>MKL performance degradation on SGI Altix UV system with Nehalem EX processor</title>
      <description><![CDATA[ <br />
<div id="art_pre_template"><b>Reference Number : DPD200155507</b><br /><br /><br /><b>Version : Intel® MKL 10.2.Update5 and earlier</b><br /><br /><br /><b>Product : </b><span ><b>Intel® Math Kernel Library (Intel® MKL)</b></span></div>
<div id="art_pre_template"><br /><br /><b>Operating System : </b><br />Red Hat Enterprise Linux* 5 <br />SuSE Linux Enterprise Server* 10<br /><br /><br /><b>Problem Description : </b><br />Intel MKL shows degraded dgemm performance on SGI Altix systems with Nehalem-EX CPUs.<br />The cause is that Intel MKL incorrectly detects the number of threads available on this type of system.<br /><br /><br /><b>Resolution Status : </b><br />The problem has been fixed; the fix is available in Intel® MKL 10.2 Update 6 and later versions.<br /><br /><br /><i>[DISCLAIMER: The information on this web site is intended for hardware system manufacturers and software developers. Intel does not warrant the accuracy, completeness or utility of any information on this site. Intel may make changes to the information or the site at any time without notice. Intel makes no commitment to update the information at this site. ALL INFORMATION PROVIDED ON THIS WEBSITE IS PROVIDED "as is" without any express, implied, or statutory warranty of any kind including but not limited to warranties of merchantability, non-infringement of intellectual property, or fitness for any particular purpose. Independent companies manufacture the third-party products that are mentioned on this site. Intel is not responsible for the quality or performance of third-party products and makes no representation or warranty regarding such products. The third-party supplier remains solely responsible for the design, manufacture, sale and functionality of its products. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.]<br /><br /><br /><br /></i></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/mkl-performance-degradation-on-sgi-altix-uv-system-with-nehalem-ex-processor/</link>
      <pubDate>Sun, 18 Jul 2010 11:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/mkl-performance-degradation-on-sgi-altix-uv-system-with-nehalem-ex-processor/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/mkl-performance-degradation-on-sgi-altix-uv-system-with-nehalem-ex-processor/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® Cluster Toolkit for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
    <item>
      <title>Intel® Math Kernel Library 10.3 Beta</title>
      <description><![CDATA[ <p><img height="135" width="747" src="http://software.intel.com/file/28203" /></p>
<p>Click here to <a target="_blank" href="https://registrationcenter.intel.com/RegCenter/BetaForm.aspx?ProductID=1459"><span ><b>register and download</b></span></a><br /><br />Intel MKL 10.3 beta is available now, and your feedback is critical to the success of our product. We are especially interested in your feedback on the key features listed below. We also encourage you to send us feedback about our beta programs, web site and support services.</p>
<p><b>New Key features:</b></p>
<p>•  <b><a href="http://software.intel.com/en-us/articles/intel-avx-optimization-in-mkl-v103-beta/">Intel<sup>®</sup> Advanced Vector Extensions (Intel AVX) optimization</a></b></p>
<p>Intel AVX is the next step in the evolution of Intel processors. Intel AVX optimization has been extended to more MKL functions to get better performance on future Intel architecture.</p>
<p>•  <b><a href="http://software.intel.com/en-us/articles/statistical-summary-library-overview/">Summary Statistics library</a></b></p>
<p>An optimized parallel library that applies recent advances in statistics, providing modern algorithms that enhance the accuracy and performance of statistical computations.</p>
<p>•  <b>Extended MKL C language support<i><br /></i></b><br />o    <a href="http://software.intel.com/en-us/articles/c-interface-for-lapack/">C interface to LAPACK</a><br />o    <a href="http://software.intel.com/en-us/articles/c-style-0-based-index-arrays-in-pardiso/">C style 0-based index arrays in PARDISO</a></p>
<p>•  <b><a href="http://software.intel.com/en-us/articles/using-the-intel-mkl-dynamic-interface-for-windows/">Dynamic interface libraries for Windows</a></b></p>
<p>New dynamic interface libraries have been added for improved linkage from C# or Java on Windows.</p>
<p>• <b><a href="http://software.intel.com/en-us/articles/dynamic-accuracy-control-for-vml/">Routine Level mode controls in VML</a></b></p>
<p>Users can now control the accuracy of each VML function separately through a new argument in each function.</p>
<p>•  <b><a href="http://software.intel.com/en-us/articles/new-matrix-vector-product-blas-routines/">New symmetric matrix-vector product BLAS routine in blocked storage</a></b></p>
<ul>
<li><a href="http://software.intel.com/en-us/articles/split-complex-real-real-support-for-2d3d-ffts-in-intel-mkl/"><b>Split Complex (real real) support for 2D/3D FFTs</b></a></li>
<li><a href="http://software.intel.com/en-us/articles/new-fast-basic-random-number-generator-sfmt19937-in-intel-mkl/"><b>New fast basic random number generator SFMT19937</b></a></li>
<li><b><a href="http://software.intel.com/en-us/articles/routine-for-linear-fraction-transformation-of-vectors/">New Routine for Linear Fraction Transformation of vectors</a></b></li>
</ul>
<p><br /><a href="http://software.intel.com/en-us/articles/a-new-directory-hierarchy-in-intel-mkl-package/"><b>Directory Changes in the Intel® MKL 10.3 Beta</b></a><b> <br /></b><a href="http://software.intel.com/en-us/articles/intel-mkl-103-bug-fixes/"><b>List of bugs fixed in this release</b></a><br /><br /><br /><b>Registration and Download:</b></p>
<p>1.     Review the <a href="http://software.intel.com/en-us/articles/intel-mkl-103-system-requirements/">Intel<sup>®</sup> MKL 10.3 Beta system requirements</a> at the end of this document.</p>
<p>2.     <a href="https://registrationcenter.intel.com/RegCenter/BetaForm.aspx?ProductID=1459">Click here to begin the registration process</a>.</p>
<p>3.     Provide a valid email address. Installation information will be sent to your email account.</p>
<p>4.     Click the Submit button to obtain a serial number and the URL to download the beta copy.</p>
<p> </p>
<p><b>Additional Documents</b></p>
<p>
<table width="701" cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="134" valign="top">
<p align="center">Link to documents</p>
</td>
<td width="525" valign="top">
<p align="center">Description</p>
</td>
</tr>
<tr>
<td width="134">
<p align="center"> </p>
<p align="center"><a href="http://software.intel.com/en-us/articles/intel-mkl-103-install-guide/">Installation Guide</a></p>
</td>
<td width="525" valign="top">
<p>This document explains how to install and configure the Intel<sup>®</sup> MKL 10.3 Beta product for use. Installation is a multi-step process. Please read this document in its entirety before beginning and follow the steps in sequence.</p>
</td>
</tr>
<tr>
<td width="134">
<p align="center"><a href="http://software.intel.com/en-us/articles/intel-mkl-103-getting-started/">Getting Started Guide</a></p>
</td>
<td width="525" valign="top">
<p>This document explains how to get started with the library and where to find information on APIs and on building an application with Intel<sup>®</sup> MKL.</p>
</td>
</tr>
<tr>
<td width="134">
<p align="center"><a href="http://software.intel.com/en-us/articles/intel-mkl-103-release-notes/">Release Notes</a></p>
</td>
<td width="525" valign="top">
<p>This document provides system requirements, installation instructions, issues and limitations, and legal information.</p>
</td>
</tr>
</tbody>
</table>
</p>
<p><br /><b><br />Beta Support and feedback:<br /></b><br />Submit problem reports, usage questions, and general feedback to the <a href="http://software.intel.com/en-us/forums/intel-math-kernel-library/">Intel MKL User Forum</a>. This forum is dedicated to discussing Intel<sup>®</sup> MKL with other developers and Intel engineers.</p>
<p>At the end of the beta program, a <b>survey</b> will be sent out to all participants. The survey will ask questions about your target platform, new feature usage, Intel MKL product quality and documentation.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-math-kernel-library-103-beta/</link>
      <pubDate>Fri, 02 Jul 2010 07:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-math-kernel-library-103-beta/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-math-kernel-library-103-beta/</guid>
      <category>Parallel Programming</category>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
  </channel></rss>
