<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Sun, 12 Feb 2012 05:26:33 -0800 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/intel-c-compiler-for-linux-kb/type/tips-and-techniques/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles Feed</title>
    <link>http://software.intel.com/en-us/articles/intel-c-compiler-for-linux-kb/type/tips-and-techniques/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>First compile time slow down on Linux</title>
      <description><![CDATA[ <br />
<div id="art_pre_template"><b>Problem : The first time the compiler is run after a login, or after it has not been run for several minutes, the compilation can take dramatically longer than subsequent compilations.<br /><br /></b><b>Environment : Red Hat Enterprise Linux and its derivatives</b><br /><br /><br /><b>Root Cause : A full lookup of multiple directories causes a timeout.</b><br /><br /><br /><b>Resolution 1 : Remove as many files and directories as you can from /tmp <br /></b>The first compilation is slow because the license manager examines every file in /tmp. This can initially take several seconds because the information is not yet cached by the OS. To avoid long delays, remove all unnecessary files from /tmp. Alternatively, see Resolution 2 below to improve the speed of the 'stat' operation on /tmp.<br /> <br /><br /><strong>Resolution 2 : Set your $LS_OPTIONS environment variable to --color=none -U<br /></strong>This is one of the faster ls option settings. It prevents ls from reading all inode information unless you explicitly ask for it.<br /><br /><br /></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/first-compile-time-slow-down-on-linux/</link>
      <pubDate>Tue, 24 Jan 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/first-compile-time-slow-down-on-linux/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/first-compile-time-slow-down-on-linux/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
    </item>
    <item>
      <title>Inlining is disabled by -pg instrumentation for gprof</title>
      <description><![CDATA[ The Intel Compiler for Linux supports the option -pg. This instruments the binary to allow function level profiling using gprof. To do this, it also disables function inlining, which may result in some loss of performance. This consequence of -pg is not documented in version 12.1 of the Intel Compiler for Linux, but will be documented in future versions.<br />          For performance analysis and profiling of applications without impacting inlining, Intel(R) VTune(TM) Amplifier XE may be used. ]]></description>
      <link>http://software.intel.com/en-us/articles/inlining-is-disabled-by-pg-instrumentation-for-gprof/</link>
      <pubDate>Fri, 20 Jan 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/inlining-is-disabled-by-pg-instrumentation-for-gprof/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/inlining-is-disabled-by-pg-instrumentation-for-gprof/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
    </item>
    <item>
      <title>How to Automate Static Security Analysis with Intel(R) C++ Compiler for Linux*</title>
      <description><![CDATA[ <p>Automate the static security analysis check done by the Intel(R) C++ Compiler for Linux. Static security analysis is the process of finding errors and security weaknesses in software through detailed analysis of source code.<br /><br />An automated quality gate like this one can notably reduce code review effort and decreases the likelihood of bugs and security threats being found once the product is in production. <br /><br />To use static security analysis as an automated quality gate in any project, run the check without the graphical user interface, which requires human interaction.</p>
<p> </p>
<p>For legacy projects, ask developers to submit new code only if it reduces the number of findings.<br />For projects written from scratch, allow no findings before new code is uploaded to your repository.<br /><br />When the check is enabled (<strong>-diag-enable sc3</strong>) and the code is compiled, a new folder is created in which the findings are stored in a custom XML format.</p>
<blockquote>
<p>$ file rXsc/data.X/rXsc.pdr<br />rXsc/data.X/rXsc.pdr: XML document text</p>
</blockquote>
<br />The xmlstar* package can be used to list the findings and the associated location information (file, line and function). The package provides a command line tool to process XML documents.<br /><br /><a href="http://xmlstar.sourceforge.net/">http://xmlstar.sourceforge.net</a><br /><br />The following command lists the findings, so it can be used to verify that there are none before proceeding with the usual development cycle. <br /><br />
<blockquote>
<p>$ xml sel -t -m /diags/diag -v "concat(message/thread/stacktrace/loc/file, ':', message/thread/stacktrace/loc/line, ':', sc_verbose)" -n rXsc/data.0/rXsc.pdr <br />/home/$USER/work/$PROD/src/pool.c:157:pool.c(157): warning #12178: this value of "ret" isn't used in the program<br />/home/$USER/work/$PROD/src/pool.c:186:pool.c(186): error #12192: unreachable statement<br />/home/$USER/work/$PROD/src/pool.c:216:pool.c(216): warning #12135: procedure "pool_done" is never caled</p>
</blockquote>
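<p>The quality gate described above could be sketched in C as follows. This is only an illustration under stated assumptions: the helper name <code>count_findings</code> is hypothetical, and it assumes that each finding appears on one line containing "warning #" or "error #", as in the sample output above.</p>

```c
#include <stdio.h>
#include <string.h>

/* Counts the lines of a stream that look like findings.
   Assumption: every finding is printed on a single line that contains
   "warning #" or "error #", as in the sample output above, and lines
   are shorter than the 4096-byte buffer. */
int count_findings(FILE *in)
{
    char line[4096];
    int findings = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (strstr(line, "warning #") != NULL ||
            strstr(line, "error #") != NULL) {
            findings++;
        }
    }
    return findings;
}
```

<p>A small <code>main</code> could call <code>count_findings(stdin)</code> on the piped output of the listing command and return a non-zero exit status when the count is greater than zero, so that a build script can stop on any finding.</p>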
<p> </p> ]]></description>
      <link>http://software.intel.com/en-us/articles/how-to-automate-static-security-analysis-with-intelr-c-compiler-for-linux/</link>
      <pubDate>Fri, 13 Jan 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/how-to-automate-static-security-analysis-with-intelr-c-compiler-for-linux/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/how-to-automate-static-security-analysis-with-intelr-c-compiler-for-linux/</guid>
      <category>Tools</category>
      <category>Intel Software Network communities</category>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Resources For Software Developers</category>
    </item>
    <item>
      <title>The C++  compiler error when doing static_cast </title>
      <description><![CDATA[ <br />
<div id="art_pre_template"><b>Problem : </b><br />Intel compiler 11.x reports an error when an unsigned long is converted with static_cast to "const unsigned short &amp;" (the diagnostic refers to the operand as "unsigned long *"). This is demonstrated using the following code segment:<br /><br />$ cat tstcase.cpp<br />int main()<br />{<br />unsigned long long_int = 0x13579bdf2458ace0;<br />unsigned short short1 = static_cast&lt;unsigned short&gt;(long_int);<br />unsigned short short2 = static_cast&lt;const unsigned short&amp;&gt;(long_int);<br />unsigned short short3 = static_cast&lt;const unsigned short&amp;&gt;(static_cast&lt;const unsigned long&amp;&gt;(long_int));<br />unsigned short short4 = reinterpret_cast&lt;const unsigned short&amp;&gt;(long_int);<br />unsigned short short5 = reinterpret_cast&lt;const unsigned short&amp;&gt;(static_cast&lt;const unsigned long&amp;&gt;(long_int));<br /><br />return 0;<br />}<br /><br />$ icc -c tstcase.cpp<br />tstcase.cpp(5): error: invalid type conversion: "unsigned long *" to "const unsigned short &amp;"<br />unsigned short short2 = static_cast&lt;const unsigned short&amp;&gt;(long_int);<br />^<br /><br />tstcase.cpp(6): error: invalid type conversion: "const unsigned long *" to "const unsigned short &amp;"<br />unsigned short short3 = static_cast&lt;const unsigned short&amp;&gt;(static_cast&lt;const unsigned long&amp;&gt;(long_int));<br />^<br /><br /><br /><br /><b>Environment : </b><br /><br />Linux, Intel C++ compiler<br /><br /><b>Resolution : </b><br /><br />This issue is fixed in Intel compiler 12.x. Intel compiler 12.0 is part of Intel Composer XE and is available for download from the Intel Registration and Download Center: https://registrationcenter.intel.com/<br /></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/the-c-compiler-error-when-doing-static_cast/</link>
      <pubDate>Sun, 20 Mar 2011 11:30:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/the-c-compiler-error-when-doing-static_cast/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/the-c-compiler-error-when-doing-static_cast/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
    </item>
    <item>
      <title>Performance Tools for Software Developers - Auto parallelization and  /Qpar-threshold</title>
      <description><![CDATA[ 
<table cellpadding="0" cellspacing="15" border="0">
<tbody>
<tr>
<td class="bodycopy">
<p>The auto-parallelization feature of the Intel C++ Compiler automatically translates serial portions of the input program into semantically equivalent multithreaded code. Automatic parallelization determines the loops that are good work sharing candidates, performs the dataflow analysis to verify correct parallel execution, and partitions the data for threaded code generation as is needed in programming with OpenMP directives. Both OpenMP and auto-parallelization provide performance gains from shared memory on multiprocessor systems based on IA-32 and Intel 64.</p>
<p>The following options control auto-parallelization:</p>
<blockquote><b>/Qparallel:</b><br />Enables the auto-parallelizer to generate multithreaded code for loops that can be safely executed in parallel. <br /><br /><b>/Qpar-threshold:n</b><br />This option sets a threshold for the auto-parallelization of loops based on the probability of profitable execution of the loop in parallel. To use this option, you must also specify -parallel (Linux and Mac OS X) or /Qparallel (Windows). The default is /Qpar-threshold:100.</blockquote>
<p>This option is useful for loops whose computation work volume cannot be determined at compile-time. The threshold is usually relevant when the loop trip count is unknown at compile-time.</p>
<p>The compiler applies a heuristic that tries to balance the overhead of creating multiple threads versus the amount of work available to be shared amongst the threads.</p>
<p><i>n</i> is an integer that sets the threshold for the auto-parallelization of loops. Possible values are 0 through 100. If <i>n</i> is 0, loops are always auto-parallelized, regardless of computation work volume. If <i>n</i> is 100, loops are auto-parallelized only when the compiler analysis data predicts that profitable parallel execution is almost certain. The intermediate values 1 to 99 represent the percentage probability of a profitable speed-up. For example, <i>n</i>=50 directs the compiler to parallelize only if there is a 50% probability of the code speeding up if executed in parallel.</p>
<p>Also, to be "100%" sure that a loop will benefit from parallelization, the compiler needs to know the iteration count at compile time. For a "99%" or lower threshold, knowing the iteration count at compile time is not a requirement.</p>
<p>This leads to a big difference in the number of loops parallelized at 99% compared to 100%. For many apps, 99% is a better setting, but for some apps with a lot of short loops, 99% will slow them down.</p>
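<p>The difference between a known and an unknown iteration count can be sketched in C as follows. This is an illustrative example only: the names <code>scale_fixed</code>, <code>scale_runtime</code> and <code>FIXED_N</code> are hypothetical, and whether either loop is actually parallelized depends on the compiler version and its cost model.</p>

```c
#include <stddef.h>

#define FIXED_N 100000

/* The trip count is a compile-time constant, so the compiler can
   estimate the amount of work without run-time information. */
void scale_fixed(double *a, double factor)
{
    for (size_t i = 0; i < FIXED_N; i++)
        a[i] *= factor;
}

/* The trip count n is known only at run time. According to the text
   above, such a loop cannot satisfy a threshold of 100 and is a
   candidate for parallelization only at 99 or lower. */
void scale_runtime(double *a, size_t n, double factor)
{
    for (size_t i = 0; i < n; i++)
        a[i] *= factor;
}
```

<p>Compiling both functions with /Qparallel /Qpar-report3 and comparing /Qpar-threshold:100 against /Qpar-threshold:99 shows which loops the compiler chose to parallelize.</p>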
<p>The following example, int_sin.c, is not auto-parallelized with /Qpar-threshold:100 when compiled with the command line below:</p>
<blockquote>C:\&gt;icl -c /Qparallel /Qpar-report3 /Qpar-threshold:100 int_sin.c
<p>If we use /Qpar-threshold:99 then it is parallelized.</p>
<p><b>Example:</b></p>
<p class="whs23" ><b ></b></p>
<p class="MsoNormal" ><span >// int_sin.c</span></p>
<p class="MsoNormal" ><span >// Intel C++ compiler sample program</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span >#include</span><span ><span >&lt;stdio.h&gt;</span></span></p>
<p class="MsoNormal" ><span >#include</span><span ><span >&lt;stdlib.h&gt;</span></span></p>
<p class="MsoNormal" ><span >#include</span><span ><span >&lt;time.h&gt;</span></span></p>
<p class="MsoNormal" ><span >#include</span><span ><span >&lt;mathimf.h&gt;</span></span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span >// Function to be integrated</span></p>
<p class="MsoNormal" ><span >// Define and prototype it here</span></p>
<p class="MsoNormal" ><span >// | sin(x) |</span></p>
<p class="MsoNormal" ><span >#define </span><span >INTEG_FUNC(x) fabs(sin(x))</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span >// Prototype timing function</span></p>
<p class="MsoNormal" ><span >double </span><span >dclock( <span >void</span>);</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span >int </span><span >main( <span >void</span>)</span></p>
<p class="MsoNormal" ><span >{</span></p>
<p class="MsoNormal" ><span ><span >// Loop counters and number of interior points</span></span></p>
<p class="MsoNormal" ><span ><span >unsigned </span><span >int</span> i, j, N;</span></p>
<p class="MsoNormal" ><span ><span >// Stepsize, independent variable x, and accumulated sum</span></span></p>
<p class="MsoNormal" ><span ><span >double</span> step, x_i, sum;</span></p>
<p class="MsoNormal" ><span ><span >// Timing variables for evaluation </span></span></p>
<p class="MsoNormal" ><span ><span >double</span> start, finish, duration;</span></p>
<p class="MsoNormal" ><span ><span >// Start integral from</span></span></p>
<p class="MsoNormal" ><span ><span >double</span> interval_begin = 0.0;</span></p>
<p class="MsoNormal" ><span ><span >// Complete integral at</span></span></p>
<p class="MsoNormal" ><span ><span >double</span> interval_end = 2.0 * 3.141592653589793238;</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span ><span >// Start timing for the entire application</span></span></p>
<p class="MsoNormal" ><span >start = clock();</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span >printf( <span >"\n"</span>);</span></p>
<p class="MsoNormal" ><span >printf( <span >" Number of | Computed Integral |\n"</span>);</span></p>
<p class="MsoNormal" ><span >printf( <span >" Interior Points | |\n"</span>);</span></p>
<p class="MsoNormal" ><span ><span >for</span> (j=2;j&lt;10;j++)</span></p>
<p class="MsoNormal" ><span >{</span></p>
<p class="MsoNormal" ><span >printf( <span >"-------------------------------------\n"</span>);</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span ><span >// Compute the number of (internal rectangles + 1)</span></span></p>
<p class="MsoNormal" ><span >N = 1 &lt;&lt; j;</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span ><span >// Compute stepsize for N-1 internal rectangles</span></span></p>
<p class="MsoNormal" ><span >step = (interval_end - interval_begin) / N;</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span ><span >// Approx. 1/2 area in first rectangle: f(x0) * [step/2]</span></span></p>
<p class="MsoNormal" ><span >sum = INTEG_FUNC(interval_begin) * step / 2.0;</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span ><span >// Apply midpoint rule:</span></span></p>
<p class="MsoNormal" ><span ><span >// Given length = f(x), compute the area of the</span></span></p>
<p class="MsoNormal" ><span ><span >// rectangle of width step</span></span></p>
<p class="MsoNormal" ><span ><span >// Sum areas of internal rectangle: f(xi + step) * step</span></span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span ><span >for</span> (i=1;i&lt;N;i++)</span></p>
<p class="MsoNormal" ><span >{</span></p>
<p class="MsoNormal" ><span >x_i = i * step;</span></p>
<p class="MsoNormal" ><span >sum += INTEG_FUNC(x_i) * step;</span></p>
<p class="MsoNormal" ><span >}</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span ><span >// Approx. 1/2 area in last rectangle: f(xN) * [step/2]</span></span></p>
<p class="MsoNormal" ><span >sum += INTEG_FUNC(interval_end) * step / 2.0;</span></p>
<p class="MsoNormal" > </p>
<p class="MsoNormal" ><span lang="IT" >printf( <span >" %10u | %14e |\n"</span>, N, sum);</span></p>
<p class="MsoNormal" ><span >}</span></p>
<p class="MsoNormal" ><span >finish = clock();</span></p>
<p class="MsoNormal" ><span >duration = (finish - start);</span></p>
<p class="MsoNormal" ><span >printf( <span >"\n"</span>);</span></p>
<p class="MsoNormal" ><span >printf( <span >" Application Clocks = %10e\n"</span>, duration);</span></p>
<p class="MsoNormal" ><span >printf( <span >"\n"</span>);</span></p>
<p class="MsoNormal" ><span ><span >}</span></span></p>
</blockquote>
</td>
</tr>
</tbody>
</table>
<table cellpadding="0" cellspacing="0" border="0">
<tbody>
<tr>
<td><img height="5" width="388" src="http://software.intel.com/file/6324" /></td>
</tr>
<tr>
<td height="10"></td>
</tr>
</tbody>
</table> ]]></description>
      <link>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold/</link>
      <pubDate>Sun, 23 Jan 2011 10:30:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/performance-tools-for-software-developers-auto-parallelization-and-qpar-threshold/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>How to find active speech level (audio level) using Intel® IPP function?</title>
      <description><![CDATA[ <p>The following piece of code finds the peak audio level of a buffer of samples.</p>
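<pre name="code" class="cpp">{
      Ipp32f tmpMinS, tmpMaxS;
      Ipp32f m_PeakAmpdB;

      // Find the smallest and largest sample values in the buffer
      ippsMinMax_32f(fileData, sizeSamples, &amp;tmpMinS, &amp;tmpMaxS);

      // The peak magnitude is the larger of |min| and |max|
      Ipp32f maxAbsSample = IPP_MAX(fabs(tmpMinS), fabs(tmpMaxS));

      if (maxAbsSample &gt; 0) {
            // Level in dB relative to 32768
            m_PeakAmpdB = 20.f * log10(maxAbsSample / 32768.f);
      } else {
            // All samples are zero: report the silence level
            m_PeakAmpdB = -91.f;
      }
}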

</pre>
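<p><br />In this code, Amp(dB) = 20 lg(S/N), where S is the signal amplitude and N is the reference level (32768 in the code above). If S = 1, then Amp(dB) = 20 lg(1/32768), which is approximately -90.3, so -91 is used to represent silence.</p> ]]></description>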
      <link>http://software.intel.com/en-us/articles/how-to-find-active-speech-level-audio-level-using-intel-ipp-function/</link>
      <pubDate>Sun, 23 Jan 2011 10:30:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/how-to-find-active-speech-level-audio-level-using-intel-ipp-function/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/how-to-find-active-speech-level-audio-level-using-intel-ipp-function/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Software Development Tool Suites for Intel® Atom™ Processor Knowledge Base</category>
      <category>Intel® Integrated Performance Primitives Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Description of PARDISO errors and messages</title>
      <description><![CDATA[ <p class="MsoNormal"><span lang="EN-US" >The table below describes the values of the PARDISO error indicator.</span></p>
<table cellpadding="0" cellspacing="0" border="1" class="MsoTableGrid" >
<tbody>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><b><i><span lang="EN-US" >Error </span></i></b><i><span lang="EN-US" >( Integer)</span></i><span lang="EN-US" ><o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><b><span lang="EN-US" >Information</span></b><span lang="EN-US" ><o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >0</span><span lang="EN-US" ><o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >no error</span><span lang="EN-US" ><o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-1</span><span lang="EN-US" ><o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >input inconsistency<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-2</span><span lang="EN-US" ><o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >not enough memory<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-3</span><span lang="EN-US" ><o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >reordering problem<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-4<o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >zero pivot, numerical factorization or<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >iterative refinement problem</span><span lang="EN-US" ><o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-5</span><span lang="EN-US" ><o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >unclassified (internal) error<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-6</span><span lang="EN-US" ><o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >preordering failed (matrix types 11, 13 only)<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-7<o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >diagonal matrix is singular<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-8<o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >32-bit integer overflow problem<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-9<o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >not enough memory for OOC<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-10<o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >problems with opening OOC temporary files<o:p></o:p></span></p>
</td>
</tr>
<tr >
<td width="111" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >-11<o:p></o:p></span></p>
</td>
<td width="527" valign="top" >
<p class="MsoNormal"><span lang="EN-US" >read/write problems with the OOC data file<o:p></o:p></span></p>
</td>
</tr>
</tbody>
</table>
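<p class="MsoNormal"><span lang="EN-US" >The table above can be turned into a small lookup helper for logging. This is a minimal sketch: <code>pardiso_error_text</code> is a hypothetical name and is not part of Intel MKL; the strings are copied from the table.</span></p>

```c
/* Maps a PARDISO error indicator to the description given in the
   table above. Hypothetical helper, not an Intel MKL function. */
const char *pardiso_error_text(int error)
{
    switch (error) {
    case 0:   return "no error";
    case -1:  return "input inconsistency";
    case -2:  return "not enough memory";
    case -3:  return "reordering problem";
    case -4:  return "zero pivot, numerical factorization or iterative refinement problem";
    case -5:  return "unclassified (internal) error";
    case -6:  return "preordering failed (matrix types 11, 13 only)";
    case -7:  return "diagonal matrix is singular";
    case -8:  return "32-bit integer overflow problem";
    case -9:  return "not enough memory for OOC";
    case -10: return "problems with opening OOC temporary files";
    case -11: return "read/write problems with the OOC data file";
    default:  return "unknown error code";
    }
}
```

<p class="MsoNormal"><span lang="EN-US" >After each PARDISO call, the returned error value can be passed to this helper to print a readable message next to the numeric code.</span></p>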
<p class="MsoNormal"><span lang="EN-US" >Each error is described in detail below:</span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span>0: <span ></span></span></i></b><b ><i ><span lang="EN-US" >no error</span></i></b><b ><i ><span lang="EN-US" ><o:p></o:p></span></i></b></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span>-1: <span ></span></span></i></b><b ><i ><span lang="EN-US" >Input inconsistency</span></i></b></p>
<span >This error can appear in the following situations:</span><br />
<ul>
<li>
<div class="MsoListParagraphCxSpFirst"><span lang="EN-US" >PARDISO was called with an incorrect stage number.</span></div>
</li>
<li>
<div class="MsoListParagraphCxSpMiddle"><span lang="EN-US" >The PARDISO calling sequence was incorrect; for example, running a stage &gt; 1 without initialization results in an error.</span></div>
</li>
<li><span lang="EN-US" >An incorrect number of matrices to be solved was set. PARDISO can solve several matrices with the same sparsity structure at once, provided that their maximum number was defined previously; setting the number of matrices to be solved outside the range <b>1 <samp>≤</samp> … <samp>≤</samp> maxfct</b> results in an error.</span></li>
<li>
<div class="MsoListParagraphCxSpLast"><span lang="EN-US" >PARDISO checks the parameters at each stage for consistency with the parameters at the previous stages. Every disagreement results in an error.</span></div>
</li>
</ul>
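<p class="MsoNormal"><span lang="EN-US" >The range condition in the list above can be checked before PARDISO is called. The sketch below is illustrative only: the helper name <code>matrix_number_is_valid</code> is hypothetical, and <code>mnum</code> is assumed to hold the number of the matrix to be solved.</span></p>

```c
/* Returns 1 when the matrix number lies in the range
   1 <= mnum <= maxfct, and 0 otherwise.
   Hypothetical helper; mnum is assumed to be the number of the
   matrix to be solved and maxfct the previously defined maximum. */
int matrix_number_is_valid(int mnum, int maxfct)
{
    return mnum >= 1 && mnum <= maxfct;
}
```

<p class="MsoNormal"><span lang="EN-US" >Performing this check in the calling code gives a clearer diagnostic than a generic -1 result.</span></p>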
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span></span></i></b><b ><i ><span lang="EN-US" >-2: <span ></span></span></i></b><b ><i ><span lang="EN-US" >not enough memory</span></i></b><span lang="EN-US" > <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error value is returned in the case of any problem with memory allocation inside PARDISO.<o:p></o:p></span></p>
<p class="MsoNormal"><i ><span ><span lang="EN-US" >PARDISO messages:<o:p></o:p></span></span></i></p>
<p class="MsoNormal"><span lang="EN-US" ><span ></span><i >"*** Error in PARDISO memory allocation: [STRUCTURE NAME], size to allocate: %d bytes"<o:p></o:p></i></span></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"total memory wanted here: %d kbyte"<o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"symbolic (max): %d symbolic (permanent): %d"<o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"real(including 1 factor): %d"<o:p></o:p></span></i></p>
<p class="MsoNormal"><span >These messages report a failure to allocate [STRUCTURE NAME] inside PARDISO. Additional information about current memory usage is also printed.</span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span></span></i></b><b ><i ><span lang="EN-US" >-3</span></i></b><b ><i ><span lang="EN-US" >: </span></i></b><b ><i ><span lang="EN-US" ><span ></span></span></i></b><b ><i ><span lang="EN-US" >reordering problem</span></i></b><span lang="EN-US" > <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error is returned for any problem at the reordering stage (phase 11).</span></p>
<p class="MsoNormal"><i ><span ><span lang="EN-US" >PARDISO messages:<o:p></o:p></span></span></i></p>
<p class="MsoNormal"><span lang="EN-US" ><span ></span><i >"*** error PARDISO: reordering, symbolic factorization"<o:p></o:p></i></span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span></span></i></b><b ><i ><span lang="EN-US" >-4: <span ></span></span></i></b><b ><i ><span lang="EN-US" >zero pivot, numerical factorization or iterative refinement problem</span></i></b><span lang="EN-US" > <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >The Intel MKL manual states:</span></p>
<p class="MsoNormal"><i ><span lang="EN-US" >Using </span></i><i><span lang="EN-US" >phase </span></i><i ><span lang="EN-US" >=33 </span></i><i ><span lang="EN-US" >results in an error message (</span></i><b ><i><span lang="EN-US" >error </span></i></b><b ><i ><span lang="EN-US" >=4, should be -4 *</span></i></b><b ><i ><span lang="EN-US" > </span></i></b><i ><span lang="EN-US" >) <span ></span>if the stopping criteria for the Krylow-Subspace iteration cannot be reached. <o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >If </span></i><i><span lang="EN-US" >phase</span></i><i ><span lang="EN-US" >= 23</span></i><i ><span lang="EN-US" >, then the factors </span></i><i><span lang="EN-US" >L</span></i><i ><span lang="EN-US" >, </span></i><i><span lang="EN-US" >U </span></i><i ><span lang="EN-US" >are recomputed for the matrix </span></i><i><span lang="EN-US" >A </span></i><i ><span lang="EN-US" >and the error flag </span></i><b ><i><span lang="EN-US" >error</span></i></b><b ><i ><span lang="EN-US" >=0 </span></i></b><i ><span lang="EN-US" >in case of a successful factorization. If </span></i><i><span lang="EN-US" >phase </span></i><i ><span lang="EN-US" >=33</span></i><i ><span lang="EN-US" >, then </span></i><b ><i><span lang="EN-US" >error </span></i></b><b ><i ><span lang="EN-US" >= -4</span></i></b><i ><span lang="EN-US" > </span></i><i ><span lang="EN-US" >signals the failure<o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >If the solver detects a zero or negative pivot for these matrix types, the factorization is stopped, PARDISO returns immediately with an error (</span></i><b ><i><span lang="EN-US" >error </span></i></b><b ><i ><span lang="EN-US" >= -4</span></i></b><i ><span lang="EN-US" >) and </span></i><i><span lang="EN-US" >iparm</span></i><i ><span lang="EN-US" >(30) </span></i><i ><span lang="EN-US" >contains the number of the equation where the first zero or negative pivot is detected.</span></i><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error is returned for any problem at the factorization stage (phase 22) or at the iterative refinement stage of the solution.</span></p>
<p class="MsoNormal"><i ><span ><span lang="EN-US" >PARDISO messages:<o:p></o:p></span></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"*** Error in PARDISO: cgs error iparam(20) %d"</span></i><span lang="EN-US" > – prints iparm(20) – see CG / CGS diagnostics in the Intel MKL manual<o:p></o:p></span></p>
<p class="MsoNormal"><i ><span lang="EN-US" ><o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"*** error PARDISO: iterative refinement"<o:p></o:p></span></i></p>
<p class="MsoNormal"><span ><span lang="EN-US" ><i>" contraction rate is greater than 0.9, interrupt" – </i>the iterative refinement is converging too slowly, so it is interrupted <i><o:p></o:p></i></span></span></p>
<p class="MsoNormal"><i ><span lang="EN-US" ><o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"*** error PARDISO: iterative refinement"<o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >" exceeds max. iteration number %d"<span > </span></span></i><span lang="EN-US" >- prints abs(iparm(8))<o:p></o:p></span></p>
<p class="MsoNormal"><i ><span lang="EN-US" ><o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"*** Error in PARDISO: internal error, insufficient memory factorization" – </span></i><span lang="EN-US" >apparently indicates a problem at the factorization stage other than a pivoting issue (see the messages below)<o:p></o:p></span></p>
<p class="MsoNormal"><i ><span lang="EN-US" ><o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"*** Error in PARDISO: zero or negative pivot, A is not SPD-matrix" – </span></i><span lang="EN-US" >the original matrix is not SPD, or is only nearly SPD (probably because of floating-point rounding errors)<o:p></o:p></span></p>
<p class="MsoNormal"><i ><span lang="EN-US" ><o:p></o:p></span></i></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"*** Error in PARDISO: zero pivot" – </span></i><span lang="EN-US" >the same as above but for other matrix types<i ><o:p></o:p></i></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
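<p class="MsoNormal"><span lang="EN-US" >The handling of error -4 described above can be sketched in C as follows. This is only an illustration: the helper pardiso_failed_equation is hypothetical and not part of Intel MKL, and it assumes that the caller passes the error value and the iparm array exactly as PARDISO returned them. Note that iparm(30) in the text uses Fortran 1-based indexing, so a C caller reads iparm[29]. The value is only meaningful when the failure was caused by a zero or negative pivot, not by an iterative refinement problem.</span></p>

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper, not part of Intel MKL: interprets the error value
   and the iparm array returned by a PARDISO call. Returns the 1-based
   number of the equation where the first zero or negative pivot was
   detected, or 0 if the error value is not -4. */
int pardiso_failed_equation(int error, const int *iparm)
{
    assert(iparm != NULL);
    if (error != -4) {
        return 0;
    }
    /* iparm(30) in Fortran notation is iparm[29] in a 0-based C array. */
    int equation = iparm[29];
    fprintf(stderr,
            "PARDISO error -4: first zero or negative pivot at equation %d\n",
            equation);
    return equation;
}
```

<p class="MsoNormal"><span lang="EN-US" >A caller would typically invoke such a helper right after the factorization phase and stop before the solve phase if it returns a nonzero value.</span></p>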
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span>-5</span></i></b><b ><i ><span lang="EN-US" >: </span></i></b><b ><i ><span lang="EN-US" ><span ></span></span></i></b><b ><i ><span lang="EN-US" >unclassified (internal) error</span></i></b><b ><i ><span lang="EN-US" ><o:p></o:p></span></i></b></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error value is not currently used and is reserved for future use.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span></span></i></b><b ><i ><span lang="EN-US" >-6: <span ></span></span></i></b><b ><i ><span lang="EN-US" >preordering failed (matrix types 11, 13 only)</span></i></b><span lang="EN-US" > <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error value is returned when a problem occurs while preparing for reordering (in the matching algorithm).<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><i ><span ><span lang="EN-US" >PARDISO messages:<o:p></o:p></span></span></i></p>
<p class="MsoNormal"><span lang="EN-US" ><span ></span><i >"*** Error in PARDISO: preordering failed after %d neqns out of %d"<o:p></o:p></i></span></p>
<p class="MsoNormal"><i ><span lang="EN-US" >"structure singular or input/parameter problem (matrix type 11,13)"<o:p></o:p></span></i></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >The message appears to provide no further meaningful information.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span>-7: <span ></span></span></i></b><b ><i ><span lang="EN-US" >diagonal matrix problem</span></i></b><span lang="EN-US" > <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >PARDISO prints no messages. <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" ><span ></span></span></i></b><b ><i ><span lang="EN-US" >-8: <span ></span></span></i></b><b ><i ><span lang="EN-US" >32-bit integer overflow problem</span></i></b><span lang="EN-US" > <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error value is returned on 32-bit architectures for large matrices when indices exceed the maximum integer value on the platform. <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><i ><span ><span lang="EN-US" >PARDISO messages:<o:p></o:p></span></span></i></p>
<p class="MsoNormal"><span lang="EN-US" ><span ></span><i >"*** error PARDISO: reordering, symbolic factorization"</i><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" >-9: </span></i></b><b ><i ><span lang="EN-US" >not enough memory for OOC</span></i></b><span lang="EN-US" > <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >Let us start with a citation of the Intel MKL manual:<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><i ><span lang="EN-US" >Note that if </span></i><i><span lang="EN-US" >iparm</span></i><i ><span lang="EN-US" >(60) </span></i><i ><span lang="EN-US" >is equal to 1 or 2, and the total peak memory needed for storing local arrays is more than </span></i><i ><span lang="EN-US" >MKL_PARDISO_OOC_MAX_CORE_SIZE</span></i><i ><span lang="EN-US" >, the program stops with <b >error -9</b>. In this case, increase of </span></i><i ><span lang="EN-US" >MKL_PARDISO_OOC_MAX_CORE_SIZE </span></i><i ><span lang="EN-US" >is recommended.</span></i><i ><span lang="EN-US" ><o:p></o:p></span></i></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error value is returned when the amount of memory available to PARDISO (defined by MKL_PARDISO_OOC_MAX_CORE_SIZE, 2000 MB by default) is not enough to solve the current matrix. The issue can be resolved by increasing this value.<o:p></o:p></span></p>
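<p class="MsoNormal"><span lang="EN-US" >One possible way to raise the limit from within a program is sketched below. It rests on several assumptions: that MKL_PARDISO_OOC_MAX_CORE_SIZE is read from the process environment, that the value is given in megabytes, that it is set before the first PARDISO call, and that the platform provides the POSIX setenv function (so this does not apply to Windows as written). The helper name set_ooc_max_core_size_mb is hypothetical.</span></p>

```c
#define _POSIX_C_SOURCE 200112L  /* makes setenv() visible in strict modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: sets MKL_PARDISO_OOC_MAX_CORE_SIZE for the current
   process to the given number of megabytes. Returns 0 on success and -1
   on failure, like setenv() itself. */
int set_ooc_max_core_size_mb(unsigned int megabytes)
{
    char value[16];
    assert(megabytes > 0);
    snprintf(value, sizeof value, "%u", megabytes);
    /* Third argument 1: overwrite the variable if it already exists. */
    return setenv("MKL_PARDISO_OOC_MAX_CORE_SIZE", value, 1);
}
```

<p class="MsoNormal"><span lang="EN-US" >For example, set_ooc_max_core_size_mb(4000) would request twice the default of 2000 MB. Setting the variable in the shell before starting the program is an alternative that needs no code change.</span></p>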
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" >-10: </span></i></b><b ><i ><span lang="EN-US" >problems with opening OOC temporary files</span></i></b><span lang="EN-US" > <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error value is returned when PARDISO can’t create / open temporary files for storing OOC arrays, e.g. in the case of wrong permissions or when files were removed or blocked or not released after the previous steps.<o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><b ><i ><span lang="EN-US" >-11: </span></i></b><b ><i ><span lang="EN-US" >read/write problems with the OOC data file</span></i></b><b ><i ><span lang="EN-US" ><o:p></o:p></span></i></b></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" >This error value is returned when some problems appear in the process of working with files, e.g. in the case of no space left on device or problems with read / write operations because of algorithm issues. <o:p></o:p></span></p>
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" >* - a documentation error that will be fixed in version 10.3 Update 3.</span></i><span lang="EN-US" ><o:p></o:p></span></p>
<p class="MsoNormal"><i><span lang="EN-US" >** - available memory means the system RAM that is available at the moment the calculations start</span></i><i><span lang="EN-US" >.</span></i><span lang="EN-US" ><o:p></o:p></span></p>
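<p class="MsoNormal"><span lang="EN-US" >The error values discussed in this part of the article can be collected in a small lookup function, which may be convenient for logging. The function pardiso_error_text below is a hypothetical helper, not an Intel MKL routine, and it covers only the values -4 through -11 (plus 0 for success); any other value falls through to a generic text.</span></p>

```c
/* Hypothetical helper, not part of Intel MKL: maps a PARDISO error value
   to the short description used in this article. */
const char *pardiso_error_text(int error)
{
    switch (error) {
    case 0:   return "no error";
    case -4:  return "problem at the factorization or iterative refinement stage";
    case -5:  return "unclassified (internal) error";
    case -6:  return "preordering failed (matrix types 11, 13 only)";
    case -7:  return "diagonal matrix problem";
    case -8:  return "32-bit integer overflow problem";
    case -9:  return "not enough memory for OOC";
    case -10: return "problems with opening OOC temporary files";
    case -11: return "read/write problems with the OOC data file";
    default:  return "error value not covered in this article";
    }
}
```

<p class="MsoNormal"><span lang="EN-US" >A caller could print pardiso_error_text(error) next to the numeric value whenever a PARDISO call returns a nonzero error.</span></p>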
<p class="MsoNormal"><span lang="EN-US" ><o:p></o:p></span></p> ]]></description>
      <link>http://software.intel.com/en-us/articles/description-of-pardiso-errors-and-messages/</link>
      <pubDate>Fri, 14 Jan 2011 11:30:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/description-of-pardiso-errors-and-messages/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/description-of-pardiso-errors-and-messages/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Math Kernel Library Knowledge Base</category>
    </item>
    <item>
      <title>How to manually target 2nd generation Intel Core processors with support for Intel AVX</title>
      <description><![CDATA[ <p><br /><strong>Product :</strong> Intel C++ Composer XE<br /><br /><strong>Version :</strong> 2011  (contains Intel C++ Compiler 12.0 or 12.1)<br /><br /><br />Manual processor dispatch allows you to write one or more versions of a function that will run only on specified types of Intel processor. The Intel processor type is detected at runtime, and the corresponding function version is executed. This feature is available only for Intel processors of IA-32 or Intel 64 architecture. It is not available for non-Intel processors nor for Intel processors of IA-64 architecture. Applications built with the manual processor dispatch feature may be more highly optimized for Intel processors than for non-Intel processors.<br /><br />The  <strong>__declspec(cpu_dispatch(cpuid,cpuid,…))</strong>  syntax is used to provide a list of targeted processors along with an empty function body (i.e., a function stub). The <strong>__declspec(cpu_specific(cpuid))</strong> syntax is used to declare each function version that is targeted at a particular type or types of processor.<br /><br />The following table lists possible values for cpuid (names are not case-sensitive):<br /><br />
<table width="100%" cellpadding="0" cellspacing="0" border="1">
<thead>
<tr>
<td width="24%" valign="top">
<p align="center"><b>Argument for cpuid</b></p>
</td>
<td width="75%" valign="top">
<p align="center"><b>Processors</b></p>
</td>
</tr>
</thead>
<tbody>
<tr>
<td width="24%" valign="top">
<p>core_2nd_gen_avx</p>
</td>
<td width="75%" valign="top">
<p>2nd generation Intel® Core<sup>TM</sup> processor family with support for Intel® Advanced Vector Extensions (Intel® AVX).</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>core_aes_pclmulqdq</p>
</td>
<td width="75%" valign="top">
<p>Intel® Core<sup>TM</sup>  processors with support for Advanced Encryption Standard (AES) instructions and carry-less multiplication instruction</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>core_i7_sse4_2</p>
</td>
<td width="75%" valign="top">
<p>Intel® Core<sup>TM</sup>  processor family with support for Intel® SSE4 Efficient Accelerated String and Text Processing instructions  (SSE4.2)</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>atom</p>
</td>
<td width="75%" valign="top">
<p>Intel® Atom<sup>TM</sup> processors</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>core_2_duo_sse4_1</p>
</td>
<td width="75%" valign="top">
<p>Intel® 45nm Hi-k next generation Intel® Core<sup>TM</sup> microarchitecture processors with support for Intel® SSE4 Vectorizing Compiler and Media Accelerators instructions (SSE4.1)</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>core_2_duo_ssse3</p>
</td>
<td width="75%" valign="top">
<p>Intel® Core<sup>TM</sup>2 Duo processors and Intel® Xeon® processors with Intel® Supplemental Streaming SIMD Extensions 3 (SSSE3)</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>pentium_4_sse3</p>
</td>
<td width="75%" valign="top">
<p>Intel® Pentium 4 processor with Intel® Streaming SIMD Extensions 3 (Intel® SSE3), Intel® Core<sup>TM</sup> Duo processors, Intel® Core<sup>TM</sup> Solo processors</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>pentium_4</p>
</td>
<td width="75%" valign="top">
<p>Intel® Pentium 4 processors</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>pentium_m</p>
</td>
<td width="75%" valign="top">
<p>Intel® Pentium M processors</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>pentium_iii</p>
</td>
<td width="75%" valign="top">
<p>Intel® Pentium III processors</p>
</td>
</tr>
<tr>
<td width="24%" valign="top">
<p>generic</p>
</td>
<td width="75%" valign="top">
<p>Other IA-32 or Intel 64 processors or compatible  processors not provided by Intel Corporation</p>
</td>
</tr>
</tbody>
</table>
<br /><br />If no other matching Intel processor type is detected, the “generic” version of the function will be executed. If the program is intended to execute on non-Intel processors, a “generic” function version must be provided. The degree of optimization of the generic function version and the processor features that it assumes are under the control of the programmer.<br /><br />The following framework illustrates how the <strong>cpu_dispatch</strong> and <strong>cpu_specific</strong> keywords might be used to create function versions for the 2nd generation Intel Core processor family, for the Intel Core processor family, for the Intel Core 2 Duo processor family, and for other Intel and compatible, non-Intel processors. Each processor-specific function body might contain processor-specific intrinsic functions, or it might be placed in a separate source file and compiled with a processor-specific compiler option. See <a href="http://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations/">http://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations/</a> for more details of such options.</p>
<pre name="code" class="cpp"><br />#include &lt;stdio.h&gt;

// need to create specific function versions for the following processors:
__declspec(cpu_dispatch(generic, core_2_duo_ssse3, core_i7_sse4_2, core_2nd_gen_avx))
void dispatch_func() {}       //  stub that will call the appropriate specific function version

__declspec(cpu_specific(generic))
void dispatch_func() {
    printf("\nCode for non-Intel processors and generic Intel processors goes here\n");
}

__declspec(cpu_specific(core_2_duo_ssse3))
void dispatch_func() {
    printf("\nCode for Intel Core 2 Duo processors with support for SSSE3 goes here\n");
}

__declspec(cpu_specific(core_i7_sse4_2))
void dispatch_func() {
    printf("\nCode for Intel Core processors with support for SSE4.2 goes here\n");
}

__declspec(cpu_specific(core_2nd_gen_avx))
void dispatch_func() {
    printf("\nCode for 2nd generation Intel Core processors goes here\n");
}

int main() {
    dispatch_func();
    printf("Return from dispatch_func\n");
    return 0;
}
</pre>
<p><br /><br />
<table cellpadding="5" cellspacing="0" rules="none" border="1">
<tbody>
<tr>
<th align="left" valign="middle" >Optimization Notice</th>
</tr>
<tr bgcolor="#ccecff">
<td>
<p>Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.</p>
<p align="right">Notice revision #20110804</p>
</td>
</tr>
</tbody>
</table>
</p>
<p> </p>
<p><i>[DISCLAIMER: The information on this web site is intended for hardware system manufacturers and software developers. Intel does not warrant the accuracy, completeness or utility of any information on this site. Intel may make changes to the information or the site at any time without notice. Intel makes no commitment to update the information at this site. ALL INFORMATION PROVIDED ON THIS WEBSITE IS PROVIDED "as is" without any express, implied, or statutory warranty of any kind including but not limited to warranties of merchantability, non-infringement of intellectual property, or fitness for any particular purpose. Independent companies manufacture the third-party products that are mentioned on this site. Intel is not responsible for the quality or performance of third-party products and makes no representation or warranty regarding such products. The third-party supplier remains solely responsible for the design, manufacture, sale and functionality of its products. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.]</i></p> ]]></description>
      <link>http://software.intel.com/en-us/articles/how-to-manually-target-2nd-generation-intel-core-processors/</link>
      <pubDate>Thu, 13 Jan 2011 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/how-to-manually-target-2nd-generation-intel-core-processors/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/how-to-manually-target-2nd-generation-intel-core-processors/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Don't optimize when using -ftrapuv for uninitialized variable detection</title>
      <description><![CDATA[ <br />
<div id="art_pre_template"><strong>Reference Number :</strong> dpd200139115,  dpd200138937  (documentation).<br /><br /><br /><strong>Version :</strong> 2011, (compiler version 12);  compiler pro versions 10 and 11 <br /><br /><br /><strong>Product : </strong>Intel(R) Composer XE; Intel(R) Compiler Pro<br /><br /><br /><strong>Operating System :</strong> Windows*, Linux*, Mac OS* X<br /><br /><b>Problem Description : </b><br />If the switch /Qftrapuv (-ftrapuv) for run-time detection of uninitialized local scalar variables is used in conjunction with optimization flags such as /O2 (-O2), it may lead to unexpected floating-point exceptions that are not related to uninitialized variables.<br /><br /><b>Explanation : </b><br />The switch /Qftrapuv (-ftrapuv)  sets local, scalar variables that are not otherwise initialized to an "unusual" initial value such as 0xCCCCCCCC. When the main program is compiled with this switch, it also unmasks the "INVALID" floating-point exception, so that exceptions may be raised when the "unusual" values are used in floating-point operations. The switch also changes the default optimization level from /O2 to /Od (from -O2 to -O0). This is so that exceptions will not be raised as a result of speculated floating-point operations or other optimizations. If the new default optimization level of /Od (-O0) is explicitly overridden, optimizations such as floating-point speculation associated with masked vector operations may result in INVALID exceptions that would not otherwise have been raised.<br /><br /><br /><strong>Solution</strong>:<br /><strong>Either</strong>:  Do not override the default /Od (-O0) optimization level when using /Qftrapuv (-ftrapuv).   
(Sometimes, it may be sufficient to use /Qfp-speculation:safe (-fp-speculation safe) in conjunction with -O2).<br /><strong>or (Fortran only)</strong>:   Use the switch /check:uninit (-check uninit)  in preference to /Qftrapuv (-ftrapuv) for the run-time detection of uninitialized, local scalar variables. This uses a different mechanism for uninitialized variable detection that is less likely to produce unrelated floating-point exceptions.<br /><br />Intel(R) Inspector XE, a component of Intel(R) Parallel Studio XE,  may also be used for the detection of some instances of uninitialized variables.<br /><br /><br /><br /><i>[DISCLAIMER: The information on this web site is intended for hardware system manufacturers and software developers. Intel does not warrant the accuracy, completeness or utility of any information on this site. Intel may make changes to the information or the site at any time without notice. Intel makes no commitment to update the information at this site. ALL INFORMATION PROVIDED ON THIS WEBSITE IS PROVIDED "as is" without any express, implied, or statutory warranty of any kind including but not limited to warranties of merchantability, non-infringement of intellectual property, or fitness for any particular purpose. Independent companies manufacture the third-party products that are mentioned on this site. Intel is not responsible for the quality or performance of third-party products and makes no representation or warranty regarding such products. The third-party supplier remains solely responsible for the design, manufacture, sale and functionality of its products. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.]</i></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/dont-optimize-when-using-ftrapuv-for-uninitialized-variable-detection/</link>
      <pubDate>Wed, 22 Dec 2010 21:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/dont-optimize-when-using-ftrapuv-for-uninitialized-variable-detection/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/dont-optimize-when-using-ftrapuv-for-uninitialized-variable-detection/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
    </item>
    <item>
      <title>Array Arguments to memcpy() Should Not Overlap</title>
      <description><![CDATA[ <div id="art_pre_template"><br /><b>Version : 2011  (contains Intel Compiler version 12.0)</b><br /><br /><br /><b>Product : Intel Parallel Composer, Intel Composer XE</b><br /><br /><br /><b>Operating System : Windows, Linux, Mac OS X</b><br /><br /><br /><b>Problem Description : </b><br />The Intel Compilers version 12 contain an optimized implementation of memcpy() that may give unexpected results if the input and output buffers overlap. The semantics of memcpy() specify that the input and output buffers should be independent; however, less optimized versions in earlier Intel compilers gave the expected result, even in the case of overlapping buffers.<br /><br /><b>Solution : <br /></b>If there is a possibility that input and output buffers might overlap, memmove() should be called instead of memcpy().<br /><br />The Fortran language standard does not allow overlapping function arguments; therefore, the Intel Fortran compiler may generate memcpy() calls for simple routines that copy arrays, assuming that this is safe. If your code calls subroutines or functions with array arguments that might overlap, you should build with the option /assume:dummy_aliases (Windows) or -assume dummy_aliases (Linux or Mac OS X). Or better still, restructure your code in compliance with the Fortran standard to avoid any possibility of array argument overlap.<br /><br />The Intel C/C++ compiler assumes by default that function array arguments might overlap. It will protect against overlapping arguments unless you build with /Qalias-args- (Windows)  or  -fargument-noalias (Linux or Mac OS X) or equivalent.<br /><br /><br /><br /><br /><i>[DISCLAIMER: The information on this web site is intended for hardware system manufacturers and software developers. Intel does not warrant the accuracy, completeness or utility of any information on this site. Intel may make changes to the information or the site at any time without notice. 
Intel makes no commitment to update the information at this site. ALL INFORMATION PROVIDED ON THIS WEBSITE IS PROVIDED "as is" without any express, implied, or statutory warranty of any kind including but not limited to warranties of merchantability, non-infringement of intellectual property, or fitness for any particular purpose. Independent companies manufacture the third-party products that are mentioned on this site. Intel is not responsible for the quality or performance of third-party products and makes no representation or warranty regarding such products. The third-party supplier remains solely responsible for the design, manufacture, sale and functionality of its products. Intel and the Intel logo are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries. *Other names and brands may be claimed as the property of others.]</i></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/array-arguments-to-memcpy-should-not-overlap/</link>
      <pubDate>Wed, 15 Dec 2010 21:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/array-arguments-to-memcpy-should-not-overlap/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/array-arguments-to-memcpy-should-not-overlap/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
    </item>
  </channel></rss>
