<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Tue, 24 Nov 2009 17:03:59 -0800 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/intel-c-compiler-for-windows-kb/type/performance-and-optimization/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles feed</title>
    <link>http://software.intel.com/en-us/articles/intel-c-compiler-for-windows-kb/performance-and-optimization/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>How to Compile for Intel® AVX</title>
      <description><![CDATA[ <div id="art_pre_template">Intel® AVX (Intel® Advanced Vector Extensions) is a 256-bit instruction set extension to Intel® SSE (Intel® Streaming SIMD Extensions) that was first announced in 2008. Further information about Intel AVX is available at <a href="http://software.intel.com/en-us/avx/">http://software.intel.com/en-us/avx/</a>.<br /><br />The Intel C/C++ and Fortran Compilers, version 11.1, support building applications for Intel AVX. On Windows*, use the command-line switch /QxAVX. On Linux*, use -xavx. The switches /QaxAVX (Windows) and -axavx (Linux) may be used to build applications that take advantage of AVX instructions on Intel systems that support them, but use only SSE instructions on other systems.<br /><br />Both the C/C++ and Fortran compilers support automatic vectorization of floating-point loops using AVX instructions. The C/C++ compiler also supports AVX-based intrinsics (via the header file immintrin.h) and inline assembly. Intel AVX allows the vectorization of a wider variety of floating-point loops than Intel SSE, with a greater potential performance gain due to the greater width of the SIMD registers. The vectorizer is enabled automatically by the switches listed above. To see which loops have been vectorized, use the switch /Qvec-report1 (Windows) or -vec-report1 (Linux).<br /><br />Pending availability of processors supporting Intel AVX, the Intel® Software Development Emulator (Intel® SDE) is available for testing programs built for Intel AVX. See <a href="http://software.intel.com/en-us/articles/intel-software-development-emulator/">http://software.intel.com/en-us/articles/intel-software-development-emulator/</a>.<br />Further general information about the Intel Compilers for C/C++ and Fortran is available at <a href="http://software.intel.com/en-us/intel-compilers/">http://software.intel.com/en-us/intel-compilers/</a>. 
Further information about compiler support for Intel AVX may be found in the Intel C++ Compiler User and Reference Guides, for example in the section 'Intrinsics for Advanced Vector Extensions', accessible online at <a href="http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/win/compiler_c/index.htm">http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/win/compiler_c/index.htm</a> .</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/how-to-compile-for-intel-avx</link>
      <pubDate>Thu, 16 Jul 2009 16:34:04 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/how-to-compile-for-intel-avx#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/how-to-compile-for-intel-avx</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
    </item>
    <item>
      <title>Performance Tools for Software Developers - Loop blocking</title>
      <description><![CDATA[ <p><b>Loop blocking</b> is a combination of strip mining and loop interchange that enhances reuse of local data. It helps nested loops that manipulate arrays too large to fit into the cache. Loop blocking enables reuse of array data by transforming the loops so that they operate on array strips that fit into the cache. In effect, a blocked loop uses array elements in sections sized to fit in the cache.</p>
<p>Use cache <b>blocking</b> to arrange a <b>loop</b> so it will perform as many computations as possible on data already residing in cache. (The next <b>block</b> of data is not read into cache until computations using the first <b>block</b> are finished.)</p>
<p>The loop blocking optimization is part of the High-Level Optimization (HLO) phase of the Intel compiler and is available with the compiler option <span style="mso-bidi-font-family: 'Courier New'; mso-ansi-language: EN;" lang="EN">-O3</span>. The compiler uses default heuristics for loop blocking, but you may also specify the loop blocking factor with /Qopt-block-factor:n on Windows or -opt-block-factor=n on Linux.</p>
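<p>As a rough illustration of the strip-mining half of this transformation, the sketch below splits a single loop into strips of a fixed length. It is hand-written for illustration and is not the code the compiler generates; the function names <b>sum_plain</b> and <b>sum_strip_mined</b> and the parameter <b>strip</b> are assumptions made for this example.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Plain loop: visits all n elements in one pass. */
long sum_plain(const int *x, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++)
        total += x[i];
    return total;
}

/* Strip-mined loop: the same iterations, grouped into strips of
 * `strip` elements. The bound check handles a final short strip. */
long sum_strip_mined(const int *x, size_t n, size_t strip)
{
    assert(strip > 0);
    long total = 0;
    for (size_t is = 0; is < n; is += strip) {
        size_t end = (is + strip < n) ? is + strip : n;
        for (size_t i = is; i < end; i++)
            total += x[i];
    }
    return total;
}
```

<p>Strip mining alone does not change the order in which elements are visited. The benefit appears only when the strip loop is interchanged with another loop of a nest, so that each strip is reused while it is still in cache.</p>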
<p><b>Data reuse:</b></p>
<p>Understanding data reuse is key to understanding blocking. There are two types of data reuse associated with loop blocking:</p>
<ul>
<li>Spatial reuse </li>
<li>Temporal reuse</li>
</ul>
<p><b>Spatial reuse</b></p>
<p>Spatial reuse uses data that was brought into the cache as a side effect of fetching another piece of data from memory. Data is fetched one cache line at a time; a cache line is 64 bytes on Intel® Core™2 processors. If the requested data is located at the beginning of a cache line (aligned data) and the rest of the line contains subsequent array elements, then for an array of 4-byte floats the requested element and the fifteen following elements are cached by a single fetch. If any of these fifteen elements is then used in a subsequent iteration of the loop, the loop exploits spatial reuse. Spatial reuse can still occur for loops with strides greater than one, but each cache line then contains fewer usable elements.</p>
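<p>The arithmetic behind spatial reuse can be sketched as follows. The example assumes a 64-byte cache line and an access pattern that starts on a line boundary; the names <b>CACHE_LINE_BYTES</b>, <b>elems_per_line</b> and <b>used_per_line</b> are hypothetical helpers introduced only for this illustration.</p>

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_BYTES 64  /* assumed cache line size in bytes */

/* Number of array elements of the given size that share one cache line. */
size_t elems_per_line(size_t elem_size)
{
    assert(elem_size > 0 && elem_size <= CACHE_LINE_BYTES);
    return CACHE_LINE_BYTES / elem_size;
}

/* Elements actually used in each cache line that is touched by a loop
 * visiting every `stride`-th element, starting on a line boundary. */
size_t used_per_line(size_t elem_size, size_t stride)
{
    assert(stride > 0);
    size_t per_line = elems_per_line(elem_size);
    return (per_line + stride - 1) / stride;  /* ceiling division */
}
```

<p>Under these assumptions, 4-byte elements give 16 elements per line and 8-byte elements give 8. A stride of 4 over 4-byte elements uses only 4 of the 16 elements in each line, which is why larger strides reduce the benefit of spatial reuse.</p>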
<p><b>Temporal reuse</b></p>
<p>Temporal reuse uses the same data item in more than one iteration of the loop. If a loop uses the same element in subsequent iterations, the loop exhibits temporal reuse. Blocking also exploits spatial reuse by ensuring that, once fetched, cache lines are not overwritten until their spatial reuse is exhausted.</p>
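<p>A matrix-vector product is one simple loop nest that exhibits temporal reuse. The sketch below is an illustrative example rather than part of the original article; the function name <b>matvec</b> and the row-by-row storage layout are assumptions.</p>

```c
#include <assert.h>
#include <stddef.h>

/* y = M * x for a rows-by-cols matrix stored row by row.
 * Every x[j] is read again on each iteration of the i loop
 * (temporal reuse), while the elements of m are each read once,
 * in sequence (spatial reuse only). */
void matvec(const double *m, const double *x, double *y,
            size_t rows, size_t cols)
{
    assert(m && x && y);
    for (size_t i = 0; i < rows; i++) {
        double acc = 0.0;
        for (size_t j = 0; j < cols; j++)
            acc += m[i * cols + j] * x[j];
        y[i] = acc;
    }
}
```

<p>If <b>x</b> is too large to stay in cache between iterations of the outer loop, its temporal reuse is lost. That is the situation blocking is meant to address.</p>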
<p><b>Example 1: Simple Loop Blocking</b></p>
<p>The following example demonstrates simple loop blocking. <b>Loop blocking</b> allows arrays A and B to be <b>blocked</b> into smaller rectangular chunks so that the combined size of the two <b>blocked</b> chunks (one from A, one from B) is smaller than the cache size, which can improve data reuse.</p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">// before_loopblocking.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">/*</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*<span style="mso-spacerun: yes;"> </span>icl /Qoption,link,"/STACK:1000000000" before_loopblocking.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*/</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: maroon; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> MAX 8000</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> add(<span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX]);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">int</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> main()</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> A[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> B[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>clock_t<span style="mso-spacerun: yes;"> </span>before, after;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: green;">//Initialize array</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>A[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>B[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>before = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>after = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>printf(<span style="color: maroon;">"\nTime taken to complete : %7.2lf secs\n"</span>, (<span style="color: blue;">float</span>)(after - before)/ CLOCKS_PER_SEC); <span style="color: green;">//List time taken to complete add function</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> add(<span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX])</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0; j&lt;MAX;j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>a[i][j] = a[i][j] + b[j][i]; <span style="color: green;">//Adds the transpose of b to a</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto;"><span style="color: black; mso-bidi-font-family: Arial; mso-bidi-font-size: 9.5pt;"><span style="font-size: small;"><span style="font-family: Times New Roman;">The above code is modified below to enhance reuse of the cached data:</span></span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">// after_loopblocking.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">/*</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*<span style="mso-spacerun: yes;"> </span>icl /Qoption,link,"/STACK:1000000000" after_loopblocking.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*/</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: maroon; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> MAX 8000</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> BS 16<span style="mso-spacerun: yes;"> </span><span style="color: green;">//Block size is selected as the loop-blocking factor. </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> add(<span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX]);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">int</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> main()</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> A[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> B[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>clock_t<span style="mso-spacerun: yes;"> </span>before, after;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: green;">//Initialize array</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>A[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>B[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>before = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>add(A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>after = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>printf(<span style="color: maroon;">"\nTime taken to complete : %7.2lf secs\n"</span>, (<span style="color: blue;">float</span>)(after - before)/ CLOCKS_PER_SEC); <span style="color: green;">//List time taken to complete add function</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> add(<span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX])</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j, ii, jj;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i+=BS) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0; j&lt;MAX;j+=BS)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(ii=i; ii&lt;i+BS; ii++)<span style="color: green;">//outer loop</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(jj=j; jj&lt;j+BS; jj++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt 2in; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{<span style="mso-spacerun: yes;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt 2in; text-indent: 0.5in; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">//Array B experiences one cache miss</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt 2in; text-indent: 0.5in; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">//for every iteration of outer loop</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 5;"> </span>a[ii][jj] = a[ii][jj] + b[jj][ii];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 3pt 0in 9pt;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 3pt 0in 9pt;"><span style="font-size: 9.5pt; color: black; font-family: Arial; mso-ansi-language: EN;" lang="EN"> </span></p>
<p><b>Example 2: Complex Blocking</b></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">// matrixMul.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">/*</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*<span style="mso-spacerun: yes;"> </span>icl /Qoption,link,"/STACK:1000000000" matrixMul.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*/</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: maroon; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> MAX 800</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> matmul(<span style="color: blue;">int</span> c[][MAX], <span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX]);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">int</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> main()</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> A[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> B[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> C[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>clock_t<span style="mso-spacerun: yes;"> </span>before, after;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: green;">//Initialize array</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>A[i][j]=j;</span></p>
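<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>B[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>C[i][j]=0; <span style="color: green;">//matmul accumulates into C, so it must start at zero</span></span></p>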
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>before = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>matmul(C, A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>after = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>printf(<span style="color: maroon;">"\nTime taken to complete : %7.2f secs\n"</span>, (<span style="color: blue;">double</span>)(after - before)/ CLOCKS_PER_SEC); <span style="color: green;">//List time taken to complete matmul function</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> matmul(<span style="color: blue;">int</span> c[][MAX], <span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX])</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j, k;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0; j&lt;MAX;j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(k=0; k &lt; MAX; k++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span></span><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT">{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT"><span style="mso-tab-count: 5;"> </span>c[i][j] = c[i][j] + a[i][k] * b[k][j]; </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span></span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 3pt 0in 9pt;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-margin-top-alt: auto; mso-margin-bottom-alt: auto;"><span style="font-size: small;"><span style="font-family: Times New Roman;"><span style="color: black; mso-bidi-font-family: Arial; mso-bidi-font-size: 9.5pt;">The above code is modified below to enhance </span><span style="color: black; mso-bidi-font-family: Arial; mso-ansi-language: EN; mso-bidi-font-size: 9.5pt;" lang="EN">spatial</span><span style="color: black; mso-bidi-font-family: Arial; mso-bidi-font-size: 9.5pt;"> and </span><span style="color: black; mso-bidi-font-family: Arial; mso-ansi-language: EN; mso-bidi-font-size: 9.5pt;" lang="EN">temporal</span><span style="color: black; mso-bidi-font-family: Arial; mso-bidi-font-size: 9.5pt;"> reuse of the cached data for arrays a, b and c:</span></span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">// matrixMulBlk.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;">/*</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*<span style="mso-spacerun: yes;"> </span>icl /Qoption,link,"/STACK:1000000000" matrixMulBlk.cpp</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>*/</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#include</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> <span style="color: maroon;">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: maroon; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> MAX 800</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">#define</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> BS 16<span style="mso-spacerun: yes;"> </span><span style="color: green;">//Block size is selected as the loop-blocking factor. </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> matmul(<span style="color: blue;">int</span> c[][MAX], <span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX]);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">int</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> main()</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> A[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> B[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> C[MAX][MAX];</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>clock_t<span style="mso-spacerun: yes;"> </span>before, after;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: green;">//Initialize array</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(i=0;i&lt;MAX;i++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>A[i][j]=j;</span></p>
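<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>B[i][j]=j;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>C[i][j]=0; <span style="color: green;">//matmul accumulates into C, so it must start at zero</span></span></p>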
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>before = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>matmul(C, A, B);</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>after = clock();</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>printf(<span style="color: maroon;">"\nTime taken to complete : %7.2f secs\n"</span>, (<span style="color: blue;">double</span>)(after - before)/ CLOCKS_PER_SEC); <span style="color: green;">//List time taken to complete matmul function</span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: green; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;">void</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> matmul(<span style="color: blue;">int</span> c[][MAX], <span style="color: blue;">int</span> a[][MAX], <span style="color: blue;">int</span> b[][MAX])</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">int</span> i, j, k, jj, kk;</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="color: blue;">for</span>(j=0;j&lt;MAX; j += BS) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(k=0; k&lt;MAX; k += BS)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span><span style="color: blue;">for</span>(i=0; i &lt; MAX; i++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>{ </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span><span style="color: blue;">for</span>(kk=k; kk&lt;k+BS; kk++)</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span>{</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt 1.5in; mso-layout-grid-align: none;"><span style="font-size: 10pt; color: blue; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-spacerun: yes;"> </span>for</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">(jj=j; jj&lt;j+BS; jj++) </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>{<span style="mso-spacerun: yes;"> </span></span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 4;"> </span>c[i][jj] = (c[</span><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT">i][jj] + a[i][kk] * b[kk][jj]); </span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-ansi-language: IT; mso-no-proof: yes;" lang="IT"><span style="mso-tab-count: 3;"> </span><span style="mso-spacerun: yes;"> </span></span><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 3;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 2;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"><span style="mso-tab-count: 1;"> </span><span style="mso-spacerun: yes;"> </span>}</span></p>
<p class="MsoNormal" style="margin: 0in 0in 0pt; mso-layout-grid-align: none;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;">}</span></p>
<p class="MsoNormal" style="margin: 3pt 0in 9pt;"><span style="font-size: 10pt; font-family: 'Courier New'; mso-no-proof: yes;"> </span></p>
 ]]></description>
      <link>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-loop-blocking</link>
      <pubDate>Mon, 13 Jul 2009 15:36:15 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-loop-blocking#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/performance-tools-for-software-developers-loop-blocking</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Performance Tools for Software Developers - Auto-parallelization and /Qpar-threshold</title>
      <description><![CDATA[ 
<table border="0" cellpadding="0" cellspacing="15">
<tbody>
<tr>
<td class="bodycopy">
<p>The auto-parallelization feature of the Intel C++ Compiler automatically translates serial portions of the input program into semantically equivalent multithreaded code. The auto-parallelizer identifies loops that are good work-sharing candidates, performs dataflow analysis to verify that parallel execution is correct, and partitions the data for threaded code generation, as a programmer would otherwise do by hand with OpenMP directives. Both OpenMP and auto-parallelized applications gain performance from shared memory on multiprocessor systems based on IA-32, Intel 64, and Itanium processors.</p>
<p>The following options control auto-parallelization:</p>
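<p>To illustrate what the dataflow analysis has to distinguish, the sketch below contrasts a loop whose iterations are independent with a loop that carries a dependence from one iteration to the next. The function names <code>scale</code> and <code>prefix_sum</code> are hypothetical and chosen for illustration only; whether a particular compiler version actually parallelizes the first loop depends on its heuristics and options.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Independent iterations: each iteration reads a[i] and writes only b[i].
// A loop of this shape is a typical work-sharing candidate.
void scale(const std::vector<double>& a, std::vector<double>& b, double factor)
{
    assert(a.size() == b.size());
    for (std::size_t i = 0; i < a.size(); ++i)
    {
        b[i] = factor * a[i];
    }
}

// Loop-carried dependence: iteration i reads the value that iteration i-1
// wrote, so the iterations cannot simply be run concurrently.
void prefix_sum(std::vector<double>& a)
{
    for (std::size_t i = 1; i < a.size(); ++i)
    {
        a[i] += a[i - 1];
    }
}
```

<p>Both functions compile as ordinary serial code; the difference only matters to a parallelizer, which must prove that reordering the iterations cannot change the result.</p>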
<blockquote><b>/Qparallel:</b><br />Enables the auto-parallelizer to generate multithreaded code for loops that can be safely executed in parallel. <br /><br /><b>/Qpar-threshold:n</b><br />This option sets a threshold for the auto-parallelization of loops based on the probability of profitable execution of the loop in parallel. To use this option, you must also specify -parallel (Linux and Mac OS X) or /Qparallel (Windows). The default is /Qpar-threshold:100.</blockquote>
<p>/Qpar-threshold is useful for loops whose computation work volume cannot be determined at compile time, which is typically the case when the loop trip count is unknown at compile time.</p>
<p>The compiler applies a heuristic that tries to balance the overhead of creating multiple threads versus the amount of work available to be shared amongst the threads.</p>
<p><i>n</i> is an integer from 0 through 100 that sets the threshold for the auto-parallelization of loops. If <i>n</i> is 0, loops are always auto-parallelized, regardless of computation work volume. If <i>n</i> is 100, loops are auto-parallelized only when the compiler's analysis predicts a performance gain, that is, when profitable parallel execution is almost certain. Intermediate values from 1 to 99 represent the percentage probability of a profitable speed-up. For example, <i>n</i>=50 directs the compiler to parallelize a loop only if there is at least a 50% probability that the code will run faster in parallel.</p>
<p>Also, to be "100%" sure that a loop will benefit from parallelization, the compiler needs to know the iteration count at compile time. For a "99%" or lower threshold, knowing the iteration count at compile time is not a requirement.</p>
<p>As a result, many more loops are parallelized at a threshold of 99 than at 100. For many applications, 99 is the better setting, but applications with many short loops may run slower at 99.</p>
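<p>The distinction between a trip count that is known at compile time and one that is not can be sketched as follows. The names <code>FIXED_N</code>, <code>fixed_table</code>, <code>fill_fixed</code> and <code>fill_runtime</code> are hypothetical. Going by the description above, only the first loop gives the compiler the iteration count it needs for a threshold of 100, while the second would be considered only at 99 or lower; this is an assumption drawn from the text, not verified /Qpar-report output.</p>

```cpp
#include <cassert>

// Hypothetical array size, fixed at compile time.
const int FIXED_N = 1024;
double fixed_table[FIXED_N];

// Trip count is a compile-time constant, so the compiler can estimate
// the amount of work in the loop.
void fill_fixed()
{
    for (int i = 0; i < FIXED_N; ++i)
    {
        fixed_table[i] = 0.5 * i;
    }
}

// Trip count n is known only at run time, so the amount of work cannot
// be determined when the loop is compiled.
void fill_runtime(double* out, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; ++i)
    {
        out[i] = 0.5 * i;
    }
}
```

<p>Both loops do the same work per iteration; they differ only in how much the compiler knows about the iteration count.</p>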
<p>The following example, int_sin.c, is not auto-parallelized when compiled with /Qpar-threshold:100, using the command line below:</p>
<blockquote>C: &gt;icl -c /Qparallel /Qpar-report3 /Qpar-threshold:100 int_sin.c
<p>With /Qpar-threshold:99, the loop is parallelized.</p>
<p><b>Example:</b></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// int_sin.c</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// Intel C++ compiler sample program</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#include</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: maroon">&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#include</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: maroon">&lt;stdlib.h&gt;</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#include</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: maroon">&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#include</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: maroon">&lt;mathimf.h&gt;</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// Function to be integrated</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// Define and prototype it here</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// | sin(x) |</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">#define</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> INTEG_FUNC(x) fabs(sin(x))</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: green; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">// Prototype timing function</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">double</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> dclock( <span style="COLOR: blue">void</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; COLOR: blue; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">int</span><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"> main( <span style="COLOR: blue">void</span>)</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">{</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Loop counters and number of interior points</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">unsigned</span> <span style="COLOR: blue">int</span> i, j, N;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Stepsize, independent variable x, and accumulated sum</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">double</span> step, x_i, sum;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Timing variables for evaluation </span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">double</span> start, finish, duration;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Start integral from</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">double</span> interval_begin = 0.0;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Complete integral at</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">double</span> interval_end = 2.0 * 3.141592653589793238;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Start timing for the entire application</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">start = clock();</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">"\n"</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" Number of | Computed Integral |\n"</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" Interior Points | |\n"</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">for</span> (j=2;j&lt;10;j++)</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">{</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">"-------------------------------------\n"</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Compute the number of (internal rectangles + 1)</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">N = 1 &lt;&lt; j;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Compute stepsize for N-1 internal rectangles</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">step = (interval_end - interval_begin) / N;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Approx. 1/2 area in first rectangle: f(x0) * [step/2]</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">sum = INTEG_FUNC(interval_begin) * step / 2.0;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Apply midpoint rule:</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Given length = f(x), compute the area of the</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// rectangle of width step</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Sum areas of internal rectangle: f(xi + step) * step</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: blue">for</span> (i=1;i&lt;N;i++)</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">{</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">x_i = i * step;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">sum += INTEG_FUNC(x_i) * step;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">}</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes"><span style="COLOR: green">// Approx. 1/2 area in last rectangle: f(xN) * [step/2]</span></span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">sum += INTEG_FUNC(interval_end) * step / 2.0;</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"> </p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes; mso-ansi-language: IT" lang="IT">printf( <span style="COLOR: maroon">" %10u | %14e |\n"</span>, N, sum);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">}</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">finish = clock();</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">duration = (finish - start);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">"\n"</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">" Application Clocks = %10e\n"</span>, duration);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">printf( <span style="COLOR: maroon">"\n"</span>);</span></p>
<p class="MsoNormal" style="MARGIN: 0in 0in 0pt; mso-layout-grid-align: none"><span style="FONT-SIZE: 10pt; FONT-FAMILY: 'Courier New'; mso-no-proof: yes">}</span></p>
</blockquote>
</td>
</tr>
</tbody>
</table>
 ]]></description>
      <link>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold</link>
      <pubDate>Mon, 13 Jul 2009 15:32:16 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>OpenMP* Loops with Function Calls for Bounds May Not Parallelize</title>
      <description><![CDATA[ <br />
<div id="art_pre_template"><strong>Reference Number :</strong>  DPD200110877<br /><br /><br /><strong>Version :</strong> 11.0, 11.1 or Intel® Parallel Composer<br /><br /><br /><strong>Operating System : </strong>Windows*, Linux*, Mac OS X*<br /><br /><br /><strong>Problem Description : </strong>The OpenMP* 3.0 standard now supports using STL iterators for OpenMP loop bounds.  However, the Intel® C++ Compiler does not parallelize code like the following:<br /><br />
<pre name="code" class="cpp">#include &lt;vector&gt;

void iterator_example()
{
  std::vector&lt;double&gt; vec(23);
  std::vector&lt;double&gt;::iterator it;

#pragma omp parallel for default(none) shared(vec) 
  for (it = vec.begin(); it &lt; vec.end(); it++)
  {
    *it = 1.0;// do work with *it //
  }
}</pre>
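<br /><br />For comparison, the same work can be written as a loop over an integer index whose upper bound is computed once, before the loop, so that the loop header contains no function calls. This is only a sketch of that loop shape; it is not a workaround documented in this article, the name <code>index_example</code> is hypothetical, and whether it avoids the issue on a given compiler version is an assumption that would need to be checked.<br /><br />

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

void index_example(std::vector<double>& vec)
{
    // Compute the bound once so the loop header has no function call.
    assert(vec.size() <= static_cast<std::size_t>(INT_MAX));
    int n = static_cast<int>(vec.size());

#pragma omp parallel for default(none) shared(vec) firstprivate(n)
    for (int i = 0; i < n; i++)
    {
        vec[i] = 1.0;  // do work with vec[i]
    }
}
```

<br />When OpenMP is not enabled, the pragma is ignored and the loop runs serially with the same result.<br />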
<br /><br />For the iterator-based loop, the compiler gives no indication (as it should) that the loop was parallelized for OpenMP*, and the generated code contains a serial version of the loop. The cause is a compiler issue: when the function calls in the loop bounds are inlined, the compiler no longer recognizes the loop as a validly formed loop for parallelization.<br /><br /><br /><strong>Resolution Status : </strong>This will be resolved in an upcoming compiler update.<br /><br /><br /><br /><em>[DISCLAIMER: The information on this web site is intended for hardware system manufacturers and software developers. Intel does not warrant the accuracy, completeness or utility of any information on this site. Intel may make changes to the information or the site at any time without notice. Intel makes no commitment to update the information at this site. ALL INFORMATION PROVIDED ON THIS WEBSITE IS PROVIDED "as is" without any express, implied, or statutory warranty of any kind including but not limited to warranties of merchantability, non-infringement of intellectual property, or fitness for any particular purpose. Independent companies manufacture the third-party products that are mentioned on this site. Intel is not responsible for the quality or performance of third-party products and makes no representation or warranty regarding such products. The third-party supplier remains solely responsible for the design, manufacture, sale and functionality of its products. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.]</em></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/openmp-loops-with-function-calls-for-bounds-may-not-parallelize</link>
      <pubDate>Thu, 12 Mar 2009 17:06:43 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/openmp-loops-with-function-calls-for-bounds-may-not-parallelize#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/openmp-loops-with-function-calls-for-bounds-may-not-parallelize</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Disable movbe to Test Intel® Atom™ Processor Targeted Code on non-Intel® Atom™ Processor Platforms</title>
      <description><![CDATA[ <p>The Intel® Compilers 11.0 allow you to target the Intel® Atom™ processor using the /QxSSE3_ATOM or -xSSE3_ATOM compiler options. These options enable generation of the movbe instruction, which is unique to the Intel® Atom™ processor. However, it is sometimes necessary to run such code on a different processor, such as the Intel® Pentium® III processor (for example, for validation when an Intel® Atom™ processor isn't available). For these situations, the compiler provides the /Qinstruction:nomovbe (for Windows*) and -minstruction=nomovbe (for Linux*/Mac*) options to disable the generation of this instruction.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/disable-movbe-to-test-intel-atom-targeted-code-on-non-atom-platforms</link>
      <pubDate>Fri, 20 Feb 2009 16:41:09 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/disable-movbe-to-test-intel-atom-targeted-code-on-non-atom-platforms#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/disable-movbe-to-test-intel-atom-targeted-code-on-non-atom-platforms</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
    </item>
    <item>
      <title>Intel® Fortran Compiler - Training courses</title>
      <description><![CDATA[ <table border="0" cellspacing="15" cellpadding="0"><tr><td class="bodycopy">
<p>Intel offers training courses designed to help software developers become productive and to improve application performance with the Intel&reg; C++ and Intel&reg; Fortran Compilers for Windows*, Linux*, and Mac* OS environments. Focus is given to software optimization on a specific processor architecture.</p>
<p>For course and registration information, visit the <a href="http://www.intel.com/software/college/">Intel&reg; Software College</a>.</p>
</td></tr></table>
 ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-fortran-compiler-training-courses</link>
      <pubDate>Fri, 19 Sep 2008 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-fortran-compiler-training-courses#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-fortran-compiler-training-courses</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
    </item>
  </channel></rss>