<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Tue, 24 Nov 2009 22:57:32 -0800 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/intel-parallel-composer-kb/type/performance-and-optimization/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles feed</title>
    <link>http://software.intel.com/en-us/articles/intel-parallel-composer-kb/performance-and-optimization/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Performance Tools for Software Developers - Loop blocking</title>
      <description><![CDATA[ <p><b>Loop blocking</b> combines strip mining and loop interchange to improve reuse of data held in cache. It helps nested loops that operate on arrays too large to fit in the cache. Blocking transforms the loops so that they work on strips of the arrays that do fit in the cache, which lets each strip be reused before it is evicted. In effect, a blocked loop processes the arrays in sections sized to fit in the cache.</p>
<p>Use cache <b>blocking</b> to arrange a <b>loop</b> so it will perform as many computations as possible on data already residing in cache. (The next <b>block</b> of data is not read into cache until computations using the first <b>block</b> are finished.)</p>
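<p>Strip mining, the first half of the transformation, can be illustrated on a single loop. The following is a minimal sketch, not compiler output; the function names and the <code>strip</code> parameter are hypothetical and stand in for the blocking factor. Strip mining alone does not change the order in which elements are visited; the cache benefit appears once the strip loops of a loop nest are interchanged, as in Example 1.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Plain loop: one pass over the whole array.
long long sum_plain(const std::vector<int>& v)
{
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); ++i)
        total += v[i];
    return total;
}

// Strip-mined loop: the outer loop steps from strip to strip and the
// inner loop walks the elements of one strip. `strip` is a hypothetical
// parameter that plays the role of the blocking factor.
long long sum_strip_mined(const std::vector<int>& v, std::size_t strip)
{
    assert(strip > 0);
    long long total = 0;
    for (std::size_t i = 0; i < v.size(); i += strip) {
        // The last strip may be shorter than `strip`.
        const std::size_t end = std::min(i + strip, v.size());
        for (std::size_t ii = i; ii < end; ++ii)
            total += v[ii];
    }
    return total;
}
```

<p>Both functions return the same sum for any positive strip length, including lengths that do not divide the array size evenly.</p>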
<p>The loop blocking optimization is part of the HLO (high-level optimization) phase of the Intel compiler and is enabled with the <span style="mso-bidi-font-family: 'Courier New'; mso-ansi-language: EN;" lang="EN">-O3</span> compiler option. The compiler uses default heuristics to choose the blocking factor, but you can also specify it yourself with /Qopt-block-factor:n on Windows or -opt-block-factor=n on Linux.</p>
<p><b>Data reuse:</b></p>
<p>Understanding data reuse is the key to understanding blocking. Two types of data reuse are associated with loop blocking:</p>
<ul>
<li>Spatial reuse </li>
<li>Temporal reuse</li>
</ul>
<p><b>Spatial reuse</b></p>
<p>Spatial reuse is the use of data that was brought into the cache as a side effect of fetching another piece of data from memory. Data is fetched one cache line at a time, which is 64 bytes on Intel(R) Core(TM)2 processors. If the requested element sits at the beginning of a cache line (aligned data) and the rest of the line holds the subsequent array elements, then for an array of 4-byte floats each fetch brings in the requested element and the fifteen elements that follow it. If any of these fifteen elements is used in a later iteration of the loop, the loop exploits spatial reuse. Loops with strides greater than one can still exhibit spatial reuse, but each cache line then contains fewer usable elements.</p>
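<p>The arithmetic behind spatial reuse can be sketched as a small helper. The function name and parameters are hypothetical, and the sketch assumes the first accessed element is aligned to the start of a cache line.</p>

```cpp
#include <cassert>
#include <cstddef>

// Number of array elements in one cache line that a loop with the given
// stride (measured in elements) actually touches, assuming the first
// accessed element sits at the start of the line.
std::size_t usable_elements_per_line(std::size_t line_bytes,
                                     std::size_t elem_bytes,
                                     std::size_t stride)
{
    assert(elem_bytes > 0 && stride > 0);
    const std::size_t per_line = line_bytes / elem_bytes;
    // Elements 0, stride, 2*stride, ... that fall inside the line.
    return (per_line + stride - 1) / stride;
}
```

<p>For a 64-byte line and 4-byte elements, a stride of one uses all 16 elements of the line, a stride of two uses 8, and a stride of 16 or more uses only one element per line, so every access goes to a different cache line.</p>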
<p><b>Temporal reuse</b></p>
<p>Temporal reuse occurs when the same data item is used in more than one iteration of the loop. If a loop reads the same element again in later iterations, it exhibits temporal reuse in the context of that loop. Blocking exploits temporal reuse by keeping a block of data in cache until the computations that need it are finished, and it exploits spatial reuse by ensuring that, once fetched, cache lines are not overwritten until their spatial reuse is exhausted.</p>
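<p>A matrix-vector product is a common loop nest that shows both kinds of reuse. The sketch below is illustrative only; the function name and the flat row-major storage are assumptions, not part of the original examples.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y = M * x for a rows x cols matrix stored row-major in a flat vector.
std::vector<double> mat_vec(const std::vector<double>& m,
                            const std::vector<double>& x,
                            std::size_t rows, std::size_t cols)
{
    assert(m.size() == rows * cols && x.size() == cols);
    std::vector<double> y(rows, 0.0);
    for (std::size_t i = 0; i < rows; ++i) {
        for (std::size_t j = 0; j < cols; ++j) {
            // Temporal reuse: x[j] is read again for every row i.
            // Spatial reuse: m is walked with stride one, so consecutive
            // iterations of j use neighbouring elements of a cache line.
            y[i] += m[i * cols + j] * x[j];
        }
    }
    return y;
}
```

<p>If <code>x</code> is too large to stay in cache between iterations of the outer loop, its temporal reuse is lost; this is the situation that blocking is meant to address.</p>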
<p><b>Example 1: Simple Loop Blocking</b></p>
<p>The following example demonstrates simple loop blocking. <b>Loop blocking</b> divides arrays A and B into smaller rectangular chunks so that the combined size of the two <b>blocked</b> chunks (one from A and one from B) is smaller than the cache, which can improve data reuse.</p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto;"><span style="font-size: 9.5pt; color: black; font-family: Arial;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">// before_loopblocking.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">/*</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*<span style="mso-spacerun: yes;"> </span>icl /Qoption,link,"/STACK:1000000000" before_loopblocking.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*/</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: maroon; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> MAX 8000</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> add(<span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX]);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">int</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> main()</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> A[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> B[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>clock_t<span style="mso-spacerun: yes;"> </span>before, after;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: green;">//Initialize array</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>A[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>B[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>before = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>after = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>printf(<span style="color: maroon;">"\nTime taken to complete : %7.2f secs\n"</span>, (<span style="color: blue;">double</span>)(after - before) / CLOCKS_PER_SEC); <span style="color: green;">//Print the time taken by the four calls to add</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> add(<span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX])</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0; j&lt;MAX;j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>a[i][j] = a[i][j] + b[j][i]; <span style="color: green;">//Adds the transpose of b to a; b is read column by column</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto;"><span style="color: black; mso-bidi-font-family: Arial; mso-bidi-font-size: 9.5pt;"><span style="font-size: small;"><span style="font-family: Times New Roman;">The above code is modified below to enhance reuse of the cached data:</span></span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">// after_loopblocking.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">/*</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*<span style="mso-spacerun: yes;"> </span>icl /Qoption,link,"/STACK:1000000000" after_loopblocking.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*/</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: maroon; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> MAX 8000</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> BS 16<span style="mso-spacerun: yes;"> </span><span style="color: green;">//Block size is selected as the loop-blocking factor. </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> add(<span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX]);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">int</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> main()</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> A[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> B[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>clock_t<span style="mso-spacerun: yes;"> </span>before, after;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: green;">//Initialize array</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>A[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>B[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>before = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>after = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>printf(<span style="color: maroon;">"\nTime taken to complete : %7.2f secs\n"</span>, (<span style="color: blue;">double</span>)(after - before) / CLOCKS_PER_SEC); <span style="color: green;">//Print the time taken by the four calls to add</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> add(<span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX])</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j, ii, jj;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i+=BS) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0; j&lt;MAX;j+=BS)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(ii=i; ii&lt;i+BS; ii++)<span style="color: green;">//outer loop</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(jj=j; jj&lt;j+BS; jj++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt 2in; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{<span style="mso-spacerun: yes;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt 2in; text-indent: 0.5in; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">//Array B experiences one cache miss</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt 2in; text-indent: 0.5in; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">//for every iteration of outer loop</span></p>
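<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 5;"> </span>a[ii][jj] = a[ii][jj] + b[jj][ii];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-tab-count: 1;"> </span>}</span></p>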
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 3pt 0in 9pt;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
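<p>The blocked loop above relies on MAX being a multiple of BS (8000 is 500 times 16). The following sketch shows one way to handle sizes that are not a multiple of the block size, by clamping each block's upper bound. The function names, the flat row-major storage, and the <code>bs</code> parameter are assumptions made for illustration; this is not the transformation the compiler performs.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Reference version: a[i][j] += b[j][i] on n x n row-major matrices.
void add_transposed(std::vector<int>& a, const std::vector<int>& b,
                    std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < n; ++j)
            a[i * n + j] += b[j * n + i];
}

// Blocked version: the same computation, visited one bs x bs tile at a
// time. Tiles on the right and bottom edges are clamped to the matrix size.
void add_transposed_blocked(std::vector<int>& a, const std::vector<int>& b,
                            std::size_t n, std::size_t bs)
{
    assert(bs > 0);
    for (std::size_t i = 0; i < n; i += bs) {
        const std::size_t i_end = std::min(i + bs, n);
        for (std::size_t j = 0; j < n; j += bs) {
            const std::size_t j_end = std::min(j + bs, n);
            for (std::size_t ii = i; ii < i_end; ++ii)
                for (std::size_t jj = j; jj < j_end; ++jj)
                    a[ii * n + jj] += b[jj * n + ii];
        }
    }
}
```

<p>Both functions produce identical results; only the order in which the elements are visited differs. Whether the blocked version is faster depends on the matrix size, the block size, and the cache of the target processor, so it is worth timing both.</p>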
<p class="MsoNormal" style="margin: 3pt 0in 9pt;"><span style="font-size: 9.5pt; color: black; font-family: Arial; mso-ansi-language: EN;" lang="EN"> </span></p>
<p><b>Example 2: Complex Blocking</b></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">// matrixMul.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">/*</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*<span style="mso-spacerun: yes;"> </span>icl /Qoption,link,"/STACK:1000000000" matrixMul.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*/</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: maroon; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> MAX 800</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> matmul(<span style="color: blue;">int</span> c[][MAX], <span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX]);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">int</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> main()</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> A[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> B[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> C[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>clock_t<span style="mso-spacerun: yes;"> </span>before, after;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: green;">//Initialize array</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>A[i][j]=j;</span></p>
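<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>B[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>C[i][j]=0; <span style="color: green;">//Clear C because matmul accumulates into it</span></span></p>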
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>before = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>matmul(C, A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>after = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>printf(<span style="color: maroon;">"\nTime taken to complete : %7.2f secs\n"</span>, (<span style="color: blue;">double</span>)(after - before) / CLOCKS_PER_SEC); <span style="color: green;">//Report time taken by the matmul function</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> matmul(<span style="color: blue;">int</span> c[][MAX], <span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX])</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j, k;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0; j&lt;MAX;j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(k=0; k &lt; MAX; k++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span></span><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT">{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT"><span style="mso-tab-count: 5;"> </span>c[i][j] = c[i][j] + a[i][k] * b[k][j]; </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span></span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 3pt 0in 9pt;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto;"><span style="font-size: small;"><span style="font-family: Times New Roman;"><span style="color: black; mso-bidi-font-family: Arial; mso-bidi-font-size: 9.5pt;">The code above is modified below to improve </span><span style="color: black; mso-bidi-font-family: Arial; mso-ansi-language: EN; mso-bidi-font-size: 9.5pt;" lang="EN">spatial</span><span style="color: black; mso-bidi-font-family: Arial; mso-bidi-font-size: 9.5pt;"> and </span><span style="color: black; mso-bidi-font-family: Arial; mso-ansi-language: EN; mso-bidi-font-size: 9.5pt;" lang="EN">temporal</span><span style="color: black; mso-bidi-font-family: Arial; mso-bidi-font-size: 9.5pt;"> reuse of the cached data for arrays a, b, and c:</span></span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">// matrixMulBlk.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">/*</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*<span style="mso-spacerun: yes;"> </span>icl /Qoption,link,"/STACK:1000000000" matrixMulBlk.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*/</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: maroon; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> MAX 800</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> BS 16<span style="mso-spacerun: yes;"> </span><span style="color: green;">//Block size is selected as the loop-blocking factor. </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> matmul(<span style="color: blue;">int</span> c[][MAX], <span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX]);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">int</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> main()</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> A[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> B[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> C[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>clock_t<span style="mso-spacerun: yes;"> </span>before, after;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: green;">//Initialize array</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>A[i][j]=j;</span></p>
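<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>B[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>C[i][j]=0; <span style="color: green;">//Clear C because matmul accumulates into it</span></span></p>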
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>before = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>matmul(C, A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>after = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>printf(<span style="color: maroon;">"\nTime taken to complete : %7.2f secs\n"</span>, (<span style="color: blue;">double</span>)(after - before) / CLOCKS_PER_SEC); <span style="color: green;">//Report time taken by the matmul function</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> matmul(<span style="color: blue;">int</span> c[][MAX], <span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX])</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j, k, jj, kk;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j += BS) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(k=0; k&lt;MAX; k += BS)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(i=0; i &lt; MAX; i++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="color: blue;">for</span>(kk=k; kk&lt;k+BS; kk++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt 1.5in; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>for</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">(jj=j; jj&lt;j+BS; jj++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{<span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>c[i][jj] = (c[</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT">i][jj] + a[i][kk] * b[kk][jj]); </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span></span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 3pt 0in 9pt;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
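<p>The blocked loops above rely on MAX being an exact multiple of BS (800 = 50 x 16). The following is a minimal sketch, not part of the original sample, of one way to handle sizes that do not divide evenly: each block's upper bound is clamped to the array size. The names <b>N</b>, <b>min_int</b>, <b>matmul_naive</b>, <b>matmul_blocked</b> and <b>blocked_matches_naive</b> are hypothetical, and N is deliberately chosen so that the last block is partial. The sketch also compares the blocked result against the plain triple loop, which is a cheap correctness check to run before timing anything.</p>

```c
#include <assert.h>
#include <string.h>

#define N  100  /* hypothetical size, deliberately not a multiple of BS */
#define BS 16   /* blocking factor, same value as in the example above */

/* Smaller of two ints; used to clamp a block's upper bound to N. */
static int min_int(int x, int y) { return x < y ? x : y; }

/* Reference version: plain i-j-k triple loop. c is cleared first. */
void matmul_naive(int c[][N], int a[][N], int b[][N])
{
    memset(c, 0, sizeof(int) * N * N);
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            for (int k = 0; k < N; k++)
                c[i][j] += a[i][k] * b[k][j];
}

/* Blocked version: same loop order as matrixMulBlk.cpp (j, k, i, kk, jj),
 * but the inner bounds are clamped so a partial last block stays in range. */
void matmul_blocked(int c[][N], int a[][N], int b[][N])
{
    assert(BS > 0);  /* a non-positive block size would never advance j or k */
    memset(c, 0, sizeof(int) * N * N);
    for (int j = 0; j < N; j += BS) {
        int jmax = min_int(j + BS, N);
        for (int k = 0; k < N; k += BS) {
            int kmax = min_int(k + BS, N);
            for (int i = 0; i < N; i++)
                for (int kk = k; kk < kmax; kk++)
                    for (int jj = j; jj < jmax; jj++)
                        c[i][jj] += a[i][kk] * b[kk][jj];
        }
    }
}

/* Returns 1 if both versions agree on every element, 0 otherwise. */
int blocked_matches_naive(void)
{
    /* static storage avoids the large stack that the samples above request */
    static int a[N][N], b[N][N], c1[N][N], c2[N][N];
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++) {
            a[i][j] = i + j;
            b[i][j] = i - j;
        }
    matmul_naive(c1, a, b);
    matmul_blocked(c2, a, b);
    return memcmp(c1, c2, sizeof c1) == 0;
}
```

<p>Calling blocked_matches_naive() from a test driver should return 1, because integer addition gives the same sum regardless of the order in which the partial products are accumulated. With floating-point data the two versions can differ in the last bits, so a tolerance-based comparison would be needed instead of memcmp.</p>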
]]></description>
      <link>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-loop-blocking</link>
      <pubDate>Mon, 13 Jul 2009 15:36:15 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-loop-blocking#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/performance-tools-for-software-developers-loop-blocking</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Performance Tools for Software Developers - Auto-parallelization and /Qpar-threshold</title>
      <description><![CDATA[ 
<table border="0" cellpadding="0" cellspacing="15">
<tbody>
<tr>
<td class="bodycopy">
<p>The auto-parallelization feature of the Intel C++ Compiler automatically translates serial portions of the input program into semantically equivalent multithreaded code. It identifies the loops that are good work-sharing candidates, performs the dataflow analysis needed to verify correct parallel execution, and partitions the data for threaded code generation, as you would otherwise do by hand with OpenMP directives. Both OpenMP and auto-parallelized applications gain performance from shared memory on multiprocessor systems based on IA-32, Intel 64, and Itanium processors.</p>
<p>The following options control auto-parallelization:</p>
<blockquote><b>/Qparallel:</b><br />Enables the auto-parallelizer to generate multithreaded code for loops that can be safely executed in parallel. <br /><br /><b>/Qpar-threshold:n</b><br />This option sets a threshold for the auto-parallelization of loops based on the probability of profitable execution of the loop in parallel. To use this option, you must also specify -parallel (Linux and Mac OS X) or /Qparallel (Windows). The default is /Qpar-threshold:100.</blockquote>
<p>The /Qpar-threshold option is useful for loops whose computation work volume cannot be determined at compile time. The threshold is usually relevant when the loop trip count is unknown at compile time.</p>
<p>The compiler applies a heuristic that tries to balance the overhead of creating multiple threads against the amount of work available to be shared among the threads.</p>
<p>The value <i>n</i> is an integer from 0 through 100 that sets the threshold for the auto-parallelization of loops. If <i>n</i> is 0, loops are always auto-parallelized, regardless of computation work volume. If <i>n</i> is 100, loops are auto-parallelized only when the compiler's analysis data predicts a performance gain, that is, only if profitable parallel execution is almost certain. Intermediate values from 1 to 99 represent the percentage probability of a profitable speed-up. For example, <i>n</i>=50 directs the compiler to parallelize a loop only if there is a 50% probability that the code will speed up when executed in parallel.</p>
<p>Also, to be "100%" sure that a loop will benefit from parallelization, the compiler needs to know the iteration count at compile time. For a "99%" or lower threshold, knowing the iteration count at compile time is not a requirement.</p>
<p>This leads to a big difference in the number of loops parallelized at 99% compared to 100%. For many applications, 99% is the better setting, but applications with many short loops may run slower at 99%.</p>
<p>The following example, int_sin.c, is not auto-parallelized when compiled with /Qpar-threshold:100 using the command line below:</p>
<blockquote>C:\&gt;icl -c /Qparallel /Qpar-report3 /Qpar-threshold:100 int_sin.c
<p>With /Qpar-threshold:99, the loop is parallelized.</p>
<p><b>Example:</b></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// int_sin.c</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// Intel C++ compiler sample program</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#include</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> <span style="COLOR: maroon">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#include</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> <span style="COLOR: maroon">&lt;stdlib.h&gt;</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#include</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> <span style="COLOR: maroon">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#include</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> <span style="COLOR: maroon">&lt;mathimf.h&gt;</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// Function to be integrated</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// Define and prototype it here</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// | sin(x) |</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#define</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> INTEG_FUNC(x) fabs(sin(x))</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// Prototype timing function</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">double</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> dclock( <span style="COLOR: blue">void</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">int</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> main( <span style="COLOR: blue">void</span>)</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">{</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Loop counters and number of interior points</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">unsigned</span> <span style="COLOR: blue">int</span> i, j, N;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Stepsize, independent variable x, and accumulated sum</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">double</span> step, x_i, sum;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Timing variables for evaluation </span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">double</span> start, finish, duration;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Start integral from</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">double</span> interval_begin = 0.0;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Complete integral at</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">double</span> interval_end = 2.0 * 3.141592653589793238;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Start timing for the entire application</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">start = clock();</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" "</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" Number of | Computed Integral | "</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" Interior Points | | "</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">for</span> (j=2;j&lt;10;j++)</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">{</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">"------------------------------------- "</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Compute the number of (internal rectangles + 1)</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">N = 1 &lt;&lt; j;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Compute stepsize for N-1 internal rectangles</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">step = (interval_end - interval_begin) / N;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Approx. 1/2 area in first rectangle: f(x0) * [step/2]</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">sum = INTEG_FUNC(interval_begin) * step / 2.0;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Apply midpoint rule:</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Given length = f(x), compute the area of the</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// rectangle of width step</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Sum areas of internal rectangle: f(xi + step) * step</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">for</span> (i=1;i&lt;N;i++)</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">{</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">x_i = i * step;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">sum += INTEG_FUNC(x_i) * step;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">}</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Approx. 1/2 area in last rectangle: f(xN) * [step/2]</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">sum += INTEG_FUNC(interval_end) * step / 2.0;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes; mso-ansi-language: IT" lang="IT">printf( <span style="COLOR: maroon">" %10u | %14e | "</span>, N, sum);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">}</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">finish = clock();</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">duration = (finish - start);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" "</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt;  FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" Application Clocks = %10e "</span>, duration);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" "</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">}</span></p>
</blockquote>
</td>
</tr>
</tbody>
</table>
 ]]></description>
      <link>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold</link>
      <pubDate>Mon, 13 Jul 2009 15:32:16 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Excerpts from Intel IPP 2nd Edition Book on Threading Support</title>
      <description><![CDATA[ <p>As multi-core and many-core systems become more widely available, there is growing interest in how Intel IPP supports threading. Threading support is provided at two levels. The first is inside the Intel IPP API primitives: some Intel IPP functions are threaded internally (check this KB for more details). The second is at a higher level, through the Intel IPP Samples: many of the application implementations in the Intel IPP Sample offering also adopt OpenMP or native threading mechanisms to maximize performance for image coding, video coding, and more on Intel multi-core and many-core systems. You can find more details when evaluating the <a target="_blank" href="http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-samples-license-agreement/">Intel IPP Samples</a>.<br /><br />In addition, the Intel IPP 2nd Edition book explains threading support in a variety of usage models. Download four excerpts from this edition that explain how to use threading in graphics, image processing, image coding, and video coding.<br /><br />Please also visit <a target="_blank" href="http://www.intel.com/intelpress/sum_ipp2.htm">Intel Press</a> for more information on the Intel IPP book.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/excerpts-from-intel-ipp-book-on-threading-support</link>
      <pubDate>Mon, 22 Jun 2009 23:19:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/excerpts-from-intel-ipp-book-on-threading-support#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/excerpts-from-intel-ipp-book-on-threading-support</guid>
      <category>Intel® Integrated Performance Primitives Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Intel® Core™ i7 processor Support</title>
      <description><![CDATA[ <p><span style="font-size: 10pt; line-height: 115%; font-family: &quot;Verdana&quot;,&quot;sans-serif&quot;; mso-bidi-font-size: 11.0pt; mso-fareast-font-family: SimSun; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA;">Intel IPP v6.0 and later versions support the latest Intel® Core™ i7 processor (codenamed "Nehalem"). Several Intel IPP string processing functions, such as the <em style="mso-bidi-font-style: normal;">ippsFind*Any()</em> functions, and transformation functions are specially optimized for Intel Core i7 processors for additional performance benefits.<span style="mso-spacerun: yes;">  </span>All other Intel IPP functions continue to use the “<strong style="mso-bidi-font-weight: normal;">p8</strong>” optimized libraries for IA-32 and the “<strong style="mso-bidi-font-weight: normal;">y8</strong>” optimized libraries for Intel® 64 when you target Intel Core i7 processors. 
The “<strong style="mso-bidi-font-weight: normal;">p8</strong>” and “<strong style="mso-bidi-font-weight: normal;">y8</strong>” optimized libraries in Intel IPP are generally optimized for Intel® Streaming SIMD Extensions 4 (Intel® SSE4).<br /><br />For the complete list of CPU identifiers supported by Intel IPP, please refer to <a href="http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-understanding-cpu-optimized-code-used-in-intel-ipp"><span style="color: #ff79c2;">this article</span></a> or check “<em style="mso-bidi-font-style: normal;">Getting_Started.htm</em>” or “<em style="mso-bidi-font-style: normal;">userguide_*.pdf</em>” in the IPP \doc directory.<br /></span><span style="font-size: 10pt; line-height: 115%; font-family: &quot;Verdana&quot;,&quot;sans-serif&quot;; mso-bidi-font-size: 11.0pt; mso-fareast-font-family: SimSun; mso-fareast-theme-font: minor-fareast; mso-bidi-font-family: 'Times New Roman'; mso-bidi-theme-font: minor-bidi; mso-ansi-language: EN-US; mso-fareast-language: ZH-CN; mso-bidi-language: AR-SA;"><br />To obtain performance results for more Intel IPP APIs on the Intel Core i7 processor, you can run the Intel IPP Performance Test tool on the target system. The tool is available under the IPP directory \tools\perfsys.<br />Please check "<em>readme.htm</em>" in this folder and also <a href="http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-using-the-performance-tool">an article</a> in the Intel IPP Knowledge Base for more information.<br /><br />Many Intel IPP samples, such as the Audio/Video sample and the new Unified Image Codec (UIC) sample, also provide decoding and encoding performance results as part of their output data.  Please check the <a href="http://www.intel.com/software/products/ipp">Intel IPP Web site</a> and click the Sample link to download them.</span></p> ]]></description>
      <link>http://software.intel.com/en-us/articles/new-nehalem-support</link>
      <pubDate>Sun, 14 Jun 2009 22:36:17 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/new-nehalem-support#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/new-nehalem-support</guid>
      <category>Intel® Integrated Performance Primitives Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>OpenMP* Loops with Function Calls for Bounds May Not Parallelize</title>
      <description><![CDATA[ <br />
<div id="art_pre_template"><strong>Reference Number :</strong>  DPD200110877<br /><br /><br /><strong>Version :</strong> 11.0, 11.1 or Intel® Parallel Composer<br /><br /><br /><strong>Operating System : </strong>Windows*, Linux*, Mac OS X*<br /><br /><br /><strong>Problem Description : </strong>The OpenMP* 3.0 standard now supports using STL iterators for OpenMP loop bounds.  However, the Intel® C++ Compiler does not parallelize code like the following:<br /><br />
<pre name="code" class="cpp">#include &lt;vector&gt;

void iterator_example()
{
  std::vector&lt;double&gt; vec(23);
  std::vector&lt;double&gt;::iterator it;

#pragma omp parallel for default(none) shared(vec) 
  for (it = vec.begin(); it &lt; vec.end(); it++)
  {
    *it = 1.0;// do work with *it //
  }
}</pre>
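<br /><br />By way of contrast, the sketch below rewrites the same work as a loop over a plain integer index, with the bound hoisted into a local variable so that the loop condition contains no function call. The function name <code>index_example</code> is hypothetical, and this form is an assumption made for illustration rather than a resolution given in this article; whether it is parallelized by the affected compiler versions has not been verified here.

```cpp
#include <vector>

// Hypothetical variant of iterator_example() that uses an integer index
// instead of an STL iterator.
void index_example(std::vector<double>& vec)
{
    // Evaluate the bound once, outside the loop, so that the loop header
    // consists only of an integer index and a local integer bound.
    int n = static_cast<int>(vec.size());

#pragma omp parallel for default(none) shared(vec, n)
    for (int i = 0; i < n; i++)
    {
        vec[i] = 1.0; // do work with vec[i]
    }
}
```

The problem description that follows refers to the iterator-based loop in the first listing.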
<br /><br />For the iterator-based loop, the compiler gives no indication (as it should) that the loop was parallelized for OpenMP*.  If you examine the generated code, you will see that the compiler produces a serial version of the loop.  The cause is a compiler issue: when the function calls used for the loop bounds are inlined, the compiler no longer recognizes the loop as a validly formed loop for parallelization.<br /><br /><br /><strong>Resolution Status : </strong>This will be resolved in an upcoming compiler update.<br /><br /><br /><br /><em>[DISCLAIMER: The information on this web site is intended for hardware system manufacturers and software developers. Intel does not warrant the accuracy, completeness or utility of any information on this site. Intel may make changes to the information or the site at any time without notice. Intel makes no commitment to update the information at this site. ALL INFORMATION PROVIDED ON THIS WEBSITE IS PROVIDED "as is" without any express, implied, or statutory warranty of any kind including but not limited to warranties of merchantability, non-infringement of intellectual property, or fitness for any particular purpose. Independent companies manufacture the third-party products that are mentioned on this site. Intel is not responsible for the quality or performance of third-party products and makes no representation or warranty regarding such products. The third-party supplier remains solely responsible for the design, manufacture, sale and functionality of its products. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.]</em></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/openmp-loops-with-function-calls-for-bounds-may-not-parallelize</link>
      <pubDate>Thu, 12 Mar 2009 17:06:43 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/openmp-loops-with-function-calls-for-bounds-may-not-parallelize#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/openmp-loops-with-function-calls-for-bounds-may-not-parallelize</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Disable movbe to Test Intel® Atom™ Processor Targeted Code on non-Intel® Atom™ Processor Platforms</title>
      <description><![CDATA[ <p>The Intel® Compilers 11.0 allow you to target the Intel® Atom™ processor using the /QxSSE3_ATOM or -xSSE3_ATOM compiler options.  These options enable generation of the movbe instruction, which is unique to the Intel® Atom™ processor.  However, it is sometimes necessary to run such code on a different processor such as the Intel® Pentium® III processor (for example, for validation purposes when an Intel® Atom™ processor isn't available).  For these situations, the compiler provides the /Qinstruction:nomovbe (for Windows*) and -minstruction=nomovbe (for Linux*/Mac*) options to disable generation of this instruction.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/disable-movbe-to-test-intel-atom-targeted-code-on-non-atom-platforms</link>
      <pubDate>Fri, 20 Feb 2009 16:41:09 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/disable-movbe-to-test-intel-atom-targeted-code-on-non-atom-platforms#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/disable-movbe-to-test-intel-atom-targeted-code-on-non-atom-platforms</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
    </item>
    <item>
      <title>Intel® IPP threaded functions</title>
      <description><![CDATA[ 
<table border="0" cellspacing="15" cellpadding="0">
<tbody>
<tr>
<td class="bodycopy">
<p>Some Intel® IPP functions contain OpenMP* code, which gives a significant performance gain on multi-processor and multi-core systems. With each version of Intel IPP, we offer an updated list of threaded APIs. These functions include color conversion, filtering, convolution, cryptography, cross correlation, matrix computation, square distance, bit reduction, and others.</p>
<p>In version 5.3 and later, you can find the threaded API list "<strong><em>ThreadedFunctionsList.txt</em></strong>" in the [IPP InstallDir]\doc directory of the standalone IPP product after completing the Intel IPP installation. <br />If you have the Intel® C/C++ Compiler Professional Edition, which includes Intel IPP, the list is located in [Compiler InstallDir]\Documentation\ipp\<br />If you have Intel® Parallel Studio, which includes Intel IPP, the list is located in [Parallel Studio InstallDir]\Parallel Studio\Composer\Documentation\en_US\ipp\</p>
<p>To find the detailed threaded API list for versions prior to v5.3, please download the file for the corresponding version:</p>
<table border="0" cellspacing="15" cellpadding="0">
<tbody>
<tr>
<td bgcolor="#a6a6a6">
<table border="0" cellspacing="1" cellpadding="5">
<tbody>
<tr>
<td class="bodycopy" bgcolor="#efefef"><strong>IPP Version</strong></td>
<td class="bodycopy" bgcolor="#efefef"><strong>Download</strong></td>
</tr>
<tr>
<td class="bodycopy" bgcolor="#ffffff">5.2</td>
<td class="bodycopy" bgcolor="#ffffff"><a href="http://downloadcenter.intel.com/Detail_Desc.aspx?agr=N&amp;ProductID=574&amp;DwnldID=13502&amp;strOSs=All&amp;OSFullname=All%20Operating%20Systems&amp;lang=eng">ipp52omp.ini</a></td>
</tr>
<tr>
<td class="bodycopy" bgcolor="#ffffff">5.1</td>
<td class="bodycopy" bgcolor="#ffffff"><a href="http://downloadcenter.intel.com/Detail_Desc.aspx?strState=LIVE&amp;ProductID=42&amp;DwnldID=10290&amp;agr=N&amp;lang=eng&amp;PrdMap=42">ipp51omp.ini</a></td>
</tr>
<tr>
<td class="bodycopy" bgcolor="#ffffff">5.0</td>
<td class="bodycopy" bgcolor="#ffffff"><a href="http://downloadcenter.intel.com/Detail_Desc.aspx?agr=N&amp;ProductID=574&amp;DwnldID=8643&amp;strOSs=44&amp;OSFullname=Windows*%20XP%20Professional&amp;lang=eng">ipp50omp.ini</a></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<p>For more topics related to Intel IPP threading and OpenMP support: <a href="/en-us/articles/intel-integrated-performance-primitives-intel-ipp-threading-openmp-faq">http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-threading-openmp-faq</a></p>
</td>
</tr>
</tbody>
</table>
 ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-intel-ipp-threaded-functions</link>
      <pubDate>Fri, 19 Sep 2008 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-intel-ipp-threaded-functions#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-integrated-performance-primitives-intel-ipp-intel-ipp-threaded-functions</guid>
      <category>Intel® Integrated Performance Primitives Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
  </channel></rss>