<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Wed, 25 Nov 2009 10:15:45 -0800 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/intel-parallel-amplifier-kb/type/tips-and-techniques/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles feed</title>
    <link>http://software.intel.com/en-us/articles/intel-parallel-amplifier-kb/tips-and-techniques/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Use command line in Intel(R) Parallel Amplifier</title>
      <description><![CDATA[ <p>Intel® Parallel Studio includes Intel® Parallel Amplifier, which runs as an add-in to Microsoft* Visual Studio*. Sometimes, however, you need to automate measurements (for example, from a script) or avoid running the IDE to reduce its impact on the system.</p>
<p>Intel® Parallel Amplifier Update 1 adds a command-line interface for this purpose.</p>
<p> </p>
<p>After installing the product, open a Command Prompt window and run the following command:</p>
<p><code>&gt; "C:\Program Files\Intel\Parallel Studio\<b>Amplifier\</b><b>ampl-vars.bat</b>"</code></p>
<p>The command-line environment is now configured and you can use the <b>ampl-cl</b> command.</p>
<p> </p>
<p>Run <code>ampl-cl -help</code> to get familiar with the command syntax.</p>
<p><b>Usage:</b></p>
<p>ampl-cl &lt;-action-option&gt; [-modifier-option] [[--] target [target options]]</p>
<p> </p>
<p>Action options: collect, collect-list, report, report-list, finalize, help, version</p>
<p> </p>
<p>Modifier options: cumulative-threshold-percent, resume-after, result-dir, search-dir, start-paused, user-data-dir, verbose</p>
<p> </p>
<p><b>Examples</b>:</p>
<ul>
<li>List all data collectors<br /><code>&gt; ampl-cl -collect-list</code></li>
<li>Run the target application, collect hot functions, and save the results in the default sub-directory<br /><code>&gt; ampl-cl -collect hotspots matrix.exe</code></li>
<li>Run the target application, collect concurrency data, and save the results in the specified sub-directory<br /><code>&gt; ampl-cl -collect concurrency -result-dir r009cc matrix.exe</code></li>
<li>Run the target application, wait 3000 milliseconds (i.e., 3 sec) before collecting times for all locks and waits, and save the results in the default sub-directory<br /><code>&gt; ampl-cl -collect lockswaits -start-paused -resume-after=3000 matrix.exe</code></li>
<li>List all report types<br /><code>&gt; ampl-cl -report-list</code></li>
<li>Report on the most recent result<br /><code>&gt; ampl-cl -report perf</code></li>
<li>Report on the data from the specified sub-directory, using a comma as the delimiter<br /><code>&gt; ampl-cl -report perf-detail -csv-delimiter="," -result-dir r012hs</code></li>
<li>List the modules that together consume 80% of total CPU time<br /><code>&gt; ampl-cl -report perf -result-dir r012cc -cumulative-threshold-percent=80</code></li>
<li>Compare results for two sessions<br /><code>&gt; ampl-cl -report summary -result-dir r012cc -result-dir r009cc</code></li>
</ul> ]]></description>
      <link>http://software.intel.com/en-us/articles/use-command-line-in-intelr-paralle-amplifier</link>
      <pubDate>Tue, 13 Oct 2009 14:21:51 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/use-command-line-in-intelr-paralle-amplifier#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/use-command-line-in-intelr-paralle-amplifier</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
    </item>
    <item>
      <title>Call stack is not full, contains [Unknown] functions or seems incorrect</title>
      <description><![CDATA[ <br />
<div id="art_pre_template"><b>Problem : </b><br />
<p><br />Intel® Parallel Amplifier provides a call stack for functions in the results view. For some applications these call stacks might appear incomplete or seem incorrect according to known application work flow, even with .pdb files for the project binaries. Additionally, there could be '[Unknown]' identifiers within the function call chain displayed in the stack view. Generally, the problem appears when there is difficulty defining the call path to the user code because of system functions on top of the stack.</p>
<p> </p>
<p>See the illustrative Hotspot results below, taken from an analysis of a multithreaded test case. Several run-time functions (in ntdll.dll) are at the top of the list. Neither the Call Stack View nor the Top-Down Tree makes it clear which user functions these calls descend from. Additionally, there are ‘Unknown’ identifiers in place of function and module names.</p>
<p> </p>
<br /><img title="ss1.JPG" alt="ss1.JPG" src="http://software.intel.com/file/22442" /><br /><br /><img title="ss2.JPG" alt="ss2.JPG" src="http://software.intel.com/file/22443" /><br /><br />
<p> </p>
<p>Let’s study the source code of the analyzed application. The ExtendBuffer function is extending a buffer chain by allocating some memory for a new node in the linked list of buffers. ExtendBuffer is supposed to take most of the program execution time as it's being called in the loop within many threads and there are no other functions that do enough calculations to consume noticeable CPU time.</p>
<p> </p>
<pre name="code" class="cpp:nocontrols:nogutter">#include &lt;windows.h&gt;

#define THREAD_NUM 16

// One node in a singly linked chain of buffers.
struct chain
{
	size_t size;
	char* buf;
	chain* p_c;
};

// Allocates a new node with an n-byte buffer and links it after p.
chain* ExtendBuffer(chain* p, size_t n)
{
	chain* p_new = new chain;
	p_new-&gt;p_c = 0;
	p_new-&gt;size = n;
	p_new-&gt;buf = new char[n];
	p-&gt;p_c = p_new;
	return p_new;
}

// Thread body: grows a chain by 10000 small nodes. The nodes are never
// freed; the test case only exercises the allocator.
DWORD WINAPI TestFunc(LPVOID param)
{
	chain* p = new chain;
	for (int i = 0; i &lt; 10000; i++)
		p = ExtendBuffer(p, 4);
	return 0;
}

int main()
{
	DWORD idThread[THREAD_NUM];
	HANDLE hThread[THREAD_NUM];
	for (int i = 0; i &lt; THREAD_NUM; i++)
		hThread[i] = CreateThread(NULL, 0, TestFunc, 0, 0, &amp;idThread[i]);
	WaitForMultipleObjects(THREAD_NUM, hThread, TRUE, INFINITE);
	return 0;
}
</pre>
<p> </p>
<p>Returning to the Hotspot results, there is no ExtendBuffer function in the list. Searching for the known user functions (e.g. ExtendBuffer or TestFunc) may find them under the '[Unknown frame(s)]' function identifier in the stack, but the CPU time attributed to them is lower than expected, which is confusing.</p>
<p> </p>
<b><br />Root Cause : </b><br /><br />An incomplete call stack like this is caused by missing debug information for system modules such as ntdll.dll. Many samples are taken inside these modules, but the data collector cannot unwind the stack properly without stack frame information.<br /><br /><b><br /><br />Resolution : </b><br /><br />
<p>Specify the path to the Microsoft* symbol server, for example http://msdl.microsoft.com/download/symbols, in Microsoft Visual Studio* on the Tools &gt; Options &gt; Debugging &gt; Symbols page.</p>
<img title="ss3.JPG" alt="ss3.JPG" src="http://software.intel.com/file/22444" /><br /><br />
<p>For more information regarding the Microsoft Symbol Server, please see <a href="http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx" linkindex="6">http://www.microsoft.com/whdc/devtools/debugging/debugstart.mspx</a></p>
<p>Intel® Parallel Amplifier will use the symbol files cached in the C:\websymbols directory, as configured in the example, and provide a more complete call stack:<br /><br /><img title="ss4.JPG" alt="ss4.JPG" src="http://software.intel.com/file/22445" /><br /><br /><img title="ss5.JPG" alt="ss5.JPG" src="http://software.intel.com/file/22446" /><br /><br />Now the source lines are resolved properly, and double-clicking the hotspot function navigates to the correct source code:<br /><br /><img title="ss6.JPG" alt="ss6.JPG" src="http://software.intel.com/file/22447" /></p>
</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/call-stack-is-not-full</link>
      <pubDate>Wed, 30 Sep 2009 06:12:05 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/call-stack-is-not-full#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/call-stack-is-not-full</guid>
      <category>Tools</category>
      <category>Intel® Parallel Amplifier</category>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
    </item>
    <item>
      <title>Parallel Implementation Methods with Intel® Parallel Composer Webinar Q&amp;A</title>
      <description><![CDATA[ <p>Q&amp;A from <a href="https://event.on24.com/event/36/88/3/rt/1/index.html."><strong>Webcast</strong></a>: The webinar "Parallel Implementation Methods with Intel® Parallel Composer" was presented by Ganesh Rao on March 31<sup>st</sup>, 2009, as part of our technical webinar series about multithreading tools and techniques. The following questions were selected from the questions and answers generated by this webcast and may be useful to other developers as a reference.</p>
<p><b>Q:  </b><b>SSE2 is a standard and it is supported by AMD. Why is it set as "Intel </b><b>Processor Specific <br />     Optimization"?<br /></b><b>A:  </b>The<b> </b>Intel® C++ compiler can generate code targeting any processor with SSE2 instruction set support, <br />     using /Qax:SSE2 or can generate code that is optimized for Intel processors with SSE2 support using /QxSSE2. <br />     The latter offers more optimizations and it does a CPU check when the application is executed.<br /><b><br />Q:  </b><b>Do you need to #include&lt;omp.h&gt; to be able to use #pragma omp?<br /></b><b>A:  </b>No, but you need to include omp.h if you want to call the OpenMP APIs such as omp_get_num_threads(), <br />     omp_get_thread_num(), etc.<br /><b><br />Q:  </b><b>If </b><b>I am using omp dlls, will concurrent omp applications (processes) </b><b>compete for CPU resources the <br />     same way as it is when I use static omp </b><b>libraries?<br /></b><b>A:  </b> Testing shows there is practically no performance advantage to linking multiple processes with the static OpenMP runtime, as opposed to linking with the dynamic runtime (uses DLLs), which is the default.  But in either linking scenario, you have to be careful not to oversubscribe the machine if using multiple OpenMP processes.  With multiple, independent OpenMP processes running on a host, the OpenMP runtime library execution mode should be 'throughput' (environment variable KMP_LIBRARY), which is the default.  
On the other hand, if you have a dedicated host, the runtime library execution mode should be 'turnaround', which will minimize the execution time of a single OpenMP process.<br /><b><br />Q:  </b><b>How is the binary compatibility between the Intel compiler and the </b><b>Visual Studio compiler maintained when VS only supports OpenMP 2.5?<br /></b><b>A:  </b>If you want to use any new features in OpenMP 3.0, you have to use the Intel® Parallel Composer, but if you only <br />     use the features from OpenMP 2.5, you can use Visual C++ 2005 or 2008 or the Intel Parallel Composer. Please <br />     refer to this knowledge base article for more information at:<br />           <a href="http://software.intel.com/en-us/articles/how-to-use-intelr-compiler-openmp-compatibility-libraries-on-windows/">http://software.intel.com/en-us/articles/how-to-use-intelr-compiler-openmp-compatibility-libraries-on-windows/</a></p>
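<p>The omp.h answer above can be illustrated with a minimal sketch. The helper names parallel_sum, limit_threads and max_threads are made up for this example and are not part of any Intel product. The loop uses only a pragma, so it needs no header, while the two runtime API calls do need omp.h. The _OPENMP guard is an assumption that keeps the code building when OpenMP is switched off.</p>

```cpp
#include <cassert>
#include <vector>
#ifdef _OPENMP
#include <omp.h>   // required only for the runtime API calls below
#endif

// Sums a vector with a parallel loop. The pragma alone needs no header;
// a compiler without OpenMP enabled ignores it and runs the loop serially.
long parallel_sum(const std::vector<int>& v) {
    long total = 0;
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < static_cast<int>(v.size()); ++i)
        total += v[i];
    return total;
}

// Caps the thread count for later parallel regions (no-op without OpenMP).
void limit_threads(int n) {
    assert(n > 0);
#ifdef _OPENMP
    omp_set_num_threads(n);
#else
    (void)n;
#endif
}

// Thread count the runtime would use for the next parallel region;
// 1 when the code is compiled without OpenMP.
int max_threads() {
#ifdef _OPENMP
    return omp_get_max_threads();
#else
    return 1;
#endif
}
```

<p>A caller might invoke limit_threads(2) before parallel_sum(data) when several OpenMP processes share one host, which is one way to avoid the oversubscription described above.</p>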
<p> <b>Q:  Maybe it's better to use 4 threads for 2 cores (i.e. number of threads = </b><b>number of cores * 2)?<br /></b><b> A:  </b>Probably not unless the machine is hyperthreaded, and hyperthreading is enabled in the BIOS; otherwise you will <br />      oversubscribe the machine.  In general, you shouldn't give any OpenMP process more than the number of machine logical threads (number-of-processors * number-of-cores/processor * number-of-threads/core), and in fact you might find it better to limit the total number of OpenMP threads to something less.  Testing shows that the <br />      performance penalty from oversubscribing the machine can be severe.  In general, it is OK to use all the machine's logical threads for one OpenMP process (and this is the default, unless you explicitly change the number by setting OMP_NUM_THREADS or by calling omp_set_num_threads()), but depending on your host usage scenario, you should set the environment for 'throughput' (multiple OpenMP processes running) or 'turnaround' (single, dedicated OpenMP process).<br /><b><br />Q:  </b>What is the difference between using omp parallel task versus using omp parallel sections calling a task?<br /><b>A:  </b>Calling a task within a section just creates extra overhead, and the tasks cannot be controlled or synchronized, since <br />     the parallel sections are independent of each other. OpenMP 3.0 tasking is more flexible and efficient compared to <br />     using parallel sections.  With parallel sections, there is no way to coordinate the task in each section, so it is not <br />     possible to determine whether one section will be executed before another, regardless of which section comes first <br />     in the program source.  
On the other hand, the task directive can take an "if" clause to cause the task to be <br />     executed immediately or be deferred; a thread can be "hard wired" to a task (called "tied"), or can be "untied", <br />     which allows any available thread in the thread pool to start executing the task.  Tasking has much better <br />     performance and scalability for nested parallel and recursive algorithms, compared to parallel sections.  There is <br />     much more overhead creating and destroying the nested parallel regions (parallel sections tasking), versus <br />     executing tasks which are all created by a single parallel region containing a task directive.  You can control the total number of threads (OMP_NUM_THREADS) with tasking, whereas with nested parallel regions (sections), with each newly created region you get OMP_NUM_THREADS new threads, and that can easily oversubscribe the host.<br /><b><br />Q:  Do you have a favorite textbook about OpenMP that you </b><b>recommend?<br /></b><b>A:  </b>Please look at the "Related Information" section in the product user guide where there is a mention of associated <br />     Intel Documents. A good recommended book to look at: "Using OpenMP: Portable Shared Memory Parallel <br />     Programming" by Barbara Chapman, Gabriele Jost, and Ruud van der Pas</p>
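<p>As a rough sketch of the tasking pattern described above, the example below creates all tasks inside a single parallel region and uses the "if" clause to run small subproblems immediately instead of deferring them. The function names fib and parallel_fib and the cutoff of 20 are assumptions chosen for illustration, not taken from the webinar.</p>

```cpp
#include <cassert>

// Recursive Fibonacci with OpenMP 3.0 tasks. When the "if" expression is
// false the task runs immediately in the encountering thread, which
// limits task overhead for small n.
long fib(int n) {
    assert(n >= 0);
    if (n < 2) return n;
    long a = 0, b = 0;
    #pragma omp task shared(a) if(n > 20)
    a = fib(n - 1);
    #pragma omp task shared(b) if(n > 20)
    b = fib(n - 2);
    #pragma omp taskwait   // wait for both child tasks before combining
    return a + b;
}

// One parallel region creates the thread team; a single thread starts the
// recursion and the other threads pick up the generated tasks.
long parallel_fib(int n) {
    long result = 0;
    #pragma omp parallel
    {
        #pragma omp single
        result = fib(n);
    }
    return result;
}
```

<p>Because only one parallel region is opened, the team size stays at the configured thread count no matter how deep the recursion goes. Nested parallel sections would instead create a new team at each level. Compiled without OpenMP, the pragmas are ignored and the code runs serially with the same result.</p>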
<p> </p> ]]></description>
      <link>http://software.intel.com/en-us/articles/parallel-implementation-methods-with-intel-parallel-composer-webinar-qa</link>
      <pubDate>Wed, 19 Aug 2009 17:10:55 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/parallel-implementation-methods-with-intel-parallel-composer-webinar-qa#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/parallel-implementation-methods-with-intel-parallel-composer-webinar-qa</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
    </item>
    <item>
      <title>How am I notified of updates for my registered products?</title>
      <description><![CDATA[ <p> </p>
<div><b>Problem : </b><br />How am I notified of updates for my registered products?<br /><br /><b>Resolution : </b><br />1) Log in to the Intel® Registration Center by entering your Login ID and Password in the Registered Users Login section of the web page. You will see a list of all products you have subscribed to.<br />2) Click My account/Change notification preference on the My products page, shown below.<br /><br /><img src="http://software.intel.com/file/21592" alt="Notifacation+update+1.PNG" title="Notifacation+update+1.PNG" /><br /><br />3) You will be directed to the page below. Check the box labeled <label for="ctl00_MainContentPlaceHolder_ckYesFilesNotification">Yes, I would like to receive Intel® Software Product update notifications.<br /><img src="http://software.intel.com/file/21593" alt="Notifacation+update+2.PNG" title="Notifacation+update+2.PNG" /><br /><br />4) Click Save Notification Preferences.<br /><br />You will then receive update notifications for your registered products.</label></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/how-notified-updates-for-products</link>
      <pubDate>Mon, 10 Aug 2009 00:40:04 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/how-notified-updates-for-products#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/how-notified-updates-for-products</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Cluster Toolkit for Linux* Knowledge Base</category>
      <category>Intel® Cluster Toolkit for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Math Kernel Library Knowledge Base</category>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
      <category>Intel® Software Development Products Registration Center Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
      <category>Intel® VTune™ Performance Analyzer for Linux* Knowledge Base</category>
      <category>Intel® VTune™ Performance Analyzer for Windows* Knowledge Base</category>
    </item>
    <item>
      <title>Use VTuneAPI in Intel® Parallel Amplifier for selective code profiling</title>
      <description><![CDATA[ <p>VTuneAPI is a set of APIs from the VTune<sup>TM</sup> Performance Analyzer. It lets you skip data collection while the application runs code that is of no interest, e.g. 3rd-party libraries. VTuneAPI provides a Pause API and a Resume API to control profiling from a C/C++ program, along with other APIs to configure sampling and call graph data collection, generate the final result, etc. For details, see page 8 of <a href="http://cache-www.intel.com/cd/00/00/21/93/219345_sampling_vtune.pdf">http://cache-www.intel.com/cd/00/00/21/93/219345_sampling_vtune.pdf</a>.</p>
<p> </p>
<p>All functions exported by VTuneAPI:</p>
<p>1 000012F4 VTBindSamplingResults</p>
<p>2 0000126C VTNameThread</p>
<p><b>3 00001234  VTPause</b></p>
<p>4 0000135C VTunePauseCounterMonitor</p>
<p>5 00001324 VTunePauseSampling</p>
<p><b>6 00001250 VTResume</b></p>
<p>7 00001378 VTuneResumeCounterMonitor</p>
<p>8 00001340 VTuneResumeSampling</p>
<p>9 00001294 VTStartSampling</p>
<p>10 000012C4 VTStopSampling</p>
<p> </p>
<p>Fortunately, <strong>Intel® Parallel Amplifier</strong> includes this feature. You can simply use VTPause/VTResume to control data collection. There is no need to call VTStartSampling/VTStopSampling, since collection can be started and stopped from the Parallel Amplifier user interface.</p>
<p>a)       Include file - <strong>VtuneApi.h</strong> under Parallel Studio\Amplifier\include</p>
<p>b)       Lib file - <strong>VtuneApi.dll</strong> under Parallel Studio\Amplifier\bin32\runtime, or Parallel Studio\Amplifier\bin64\runtime</p>
<p> </p>
<p>Follow these steps to control data collection with VTuneAPI:</p>
<ol type="1">
<li>Add the path to Vtuneapi.h to the <b>Include</b> <b>Directories</b> in Microsoft* Visual Studio* -&gt;Tools-&gt;Options-&gt;VC++ Directories-&gt;Include files</li>
<li>Copy VtuneApi.dll to the project's Release directory, which contains the generated application.</li>
<li>In Microsoft* Visual Studio, right-click the project -&gt; Intel Parallel Amplifier-&gt;Project Properties-&gt; select "Start data collection paused" and clear "Resume collection after sec."</li>
<li><img src="http://software.intel.com/file/21417" alt="vtuneapi.bmp" title="vtuneapi.bmp" /></li>
<li>Insert VTuneAPI calls into your source code as in the example below, then rebuild the project</li>
</ol>
<p>      #include &lt;windows.h&gt;</p>
<p>      #include "Vtuneapi.h"</p>
<p>      ......</p>
<p>       typedef void (*VTFUNC)(void);</p>
<p>       // Load the DLL at run time and look up the two control functions.</p>
<p>       // The pointers stay NULL if the DLL or a function is not found.</p>
<p>       HMODULE hMod = LoadLibraryA("VtuneApi.dll");</p>
<p>       VTFUNC vtResume = hMod ? (VTFUNC) GetProcAddress(hMod, "VTResume") : NULL;</p>
<p>       VTFUNC vtPause = hMod ? (VTFUNC) GetProcAddress(hMod, "VTPause") : NULL;</p>
<p>       ......</p>
<p>       if (vtResume) vtResume();</p>
<p>       // code section 1 - collecting data</p>
<p>       if (vtPause) vtPause();</p>
<p>       // code section 2 - <b>not</b> collecting data</p>
<p>       if (vtResume) vtResume();</p>
<p>       // code section 3 - collecting data</p>
<p>      ......</p>
<p>Finally, run Parallel Amplifier to get the expected result.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/use-vtuneapi-in-intel-parallel-amplifier-for-selective-code-profiling</link>
      <pubDate>Wed, 29 Jul 2009 02:18:19 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/use-vtuneapi-in-intel-parallel-amplifier-for-selective-code-profiling#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/use-vtuneapi-in-intel-parallel-amplifier-for-selective-code-profiling</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
    </item>
    <item>
      <title>Q&amp;A from Webinar - Solve Parallelism with Intel® Parallel Studio</title>
      <description><![CDATA[ <strong>Q&amp;A from Webinar - <a href="https://event.on24.com/event/36/88/3/rt/1/index.html">Solve Parallelism with Intel® Parallel Studio</a>. <br /><br />Q1. Can I use Fortran language with Intel(R) Parallel Studio? <br /></strong>
<blockquote>
<p>A: No. Intel Parallel Studio targets Microsoft* Visual C++ developers.<br />If your program is written in Fortran, you should use the following tools:</p>
<ul>
<li>Intel Visual Fortran Compiler Professional Edition for Windows</li>
<li>Intel VTune™ Analyzer with Intel Thread Profiler for Windows</li>
<li>Intel Thread Checker for Windows*</li>
</ul>
Please visit <a href="http://software.intel.com/en-us/articles/intel-software-evaluation-center/">http://software.intel.com/en-us/articles/intel-software-evaluation-center/</a> for more information.</blockquote>
<br /><strong>Q2. Can you tell Intel Parallel Inspector to ignore the system DLLs? They probably don't contain threading errors, wouldn't you expect?  </strong><br />
<blockquote>A: Yes, but most are already ignored by default. <br /><br />Suppression can be used to suppress any errors reported in any DLL, source file, or function.  See the Defining Private Suppression Rules topic in the Parallel Inspector help for more information.</blockquote>
<br /><strong>Q3. What versions of Microsoft* Visual Studio* will the Intel Parallel Studio run in?  <br /></strong>
<blockquote>A: The Intel Parallel Studio can be used with Visual Studio 2005* standard edition or above, or Visual Studio 2008* standard edition or above.<br /><br />Please see the Release Notes at <a href="http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf">http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf</a> for complete information.</blockquote>
<br /><strong>Q4. Can you comment on the compatibility of OpenMP* with Windows Threads?</strong><br />
<blockquote>A: OpenMP and native threads can be used together. <br />Be aware of the possibility of thread oversubscription. You can control the number of threads OpenMP uses through an environment variable or programmatically in the code.<br /><br />This article discusses the topic: <a href="http://software.intel.com/en-us/articles/intel-threading-building-blocks-openmp-or-native-threads/">http://software.intel.com/en-us/articles/intel-threading-building-blocks-openmp-or-native-threads/</a></blockquote>
<br /><strong>Q5.  I have tried running Intel Parallel Amplifier on an interactive Embedded system and it quickly fails to start. What I would like to do is profile execution for a period of time and then get an analysis. Any tips for how to work with this type of application? </strong><br />
<blockquote>A: If the program can run within Microsoft Visual Studio, the Intel Parallel Amplifier should work. It might be a bug. <br />To report any issue about Intel Parallel Studio, please visit our Studio Forum <a href="http://software.intel.com/en-us/forums/intel-parallel-studio/">http://software.intel.com/en-us/forums/intel-parallel-studio/</a>.</blockquote>
<br /><strong>Q6. Can Intel Parallel Studio help us detect cache thrashing, which is one of the bottlenecks of multi-core programming?</strong><br />
<blockquote>A: No. <br />Please use the Intel VTune<sup> </sup>Analyzer with Intel Thread Profiler for Windows*.</blockquote>
<strong><br />Q7.  What about cross-compiling and MacOS X* &amp; Linux* support?</strong><br />
<blockquote>A: No, that is not supported by the Intel Parallel Studio. <br /><br />Cross-compiling for a different architecture of the same OS is supported, but cross-compiling for a different OS is not. For example, you can cross-compile an application for Intel 64 Windows on an IA-32 Windows system, but you cannot cross-compile an application for Linux or MacOS X on a Windows system. <br /><br />For Linux, please use the tools below: <br />
<ul>
<li>Intel C++ Compiler Professional Edition for Linux</li>
<li>Intel VTune Analyzer for Linux</li>
<li>Intel Thread Checker for Linux</li>
</ul>
<p><br />For MacOS X, we currently have only the following:</p>
<ul>
<li>Intel C++ Compiler Professional Edition for MacOS X</li>
</ul>
<p><br />Please visit <a href="http://software.intel.com/en-us/articles/intel-software-evaluation-center/">http://software.intel.com/en-us/articles/intel-software-evaluation-center/</a> for more information.</p>
</blockquote>
<br /><strong>Q8. Can I compile the MFC-based applications with Intel® Parallel Composer? </strong><br />
<blockquote>A: Yes, if the code does not use "Attribute" or "CLR".  <br />The Intel Parallel Studio only supports native (unmanaged) C/C++ programs.</blockquote>
<br /><strong>Q9.  Does the Intel Parallel Studio encompass all the features of Intel VTune Analyzer? Can Intel Parallel Studio replace Intel VTune Analyzer?</strong><br />
<blockquote>A: No. <br />Please see this article for details: <a href="http://software.intel.com/sites/products/collateral/studio/Amplifier_VTune_Comparison.pdf" title="http://software.intel.com/sites/products/collateral/studio/Amplifier_VTune_Comparison.pdf">http://software.intel.com/sites/products/collateral/studio/Amplifier_VTune_Comparison.pdf</a></blockquote>
<br /><strong>Q10. Your product surely targets the Intel architecture, but what about other brands of CPU? If we use your product, will it also work on other CPUs without problems and still deliver the same performance?</strong><br />
<blockquote>A: Yes, the Intel Parallel Studio can be used on CPUs of any other brand running Windows. The Intel Parallel Composer has options for targeting other CPUs, and using them should improve performance.
<p> </p>
Please see this KB article for details: <a href="http://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations/">http://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-compiler-options-for-sse-generation-and-processor-specific-optimizations/</a></blockquote>
<br /><strong>Q11.  How long does it take to finish the Intel Parallel Amplifier and Intel Parallel Inspector analysis on a large application (32 MB)? I have been running one for hours, and it is still going.</strong><br />
<blockquote>A: The Intel Parallel Amplifier should not take as long as the Parallel Inspector. Be sure to report problems to our Forum <a href="http://software.intel.com/en-us/forums/intel-parallel-studio/">http://software.intel.com/en-us/forums/intel-parallel-studio/</a>.</blockquote>
<br /><strong>Q12. What happens if you have an application with languages other than C++? Can you still analyze it with Parallel Studio? What if a deadlock happens in your Visual Basic* code, for example?</strong><br />
<blockquote>A: No, the Intel Parallel Studio only supports native C++ code. <br />If the deadlock happens in a Visual Basic DLL, the Parallel Inspector cannot find it.</blockquote>
<br /><strong>Q13.  Some operating systems like AIX* allow application developers to have threads to be rescheduled on the same processor -- does this tool presently (or a later version may) capture processor cache hits/misses?</strong><br />
<blockquote>A: Not right now.  A future version of the Parallel Studio will support cache miss detection.<br /><br />Please use the Intel VTune Analyzer with Intel Thread Profiler for Windows.</blockquote> ]]></description>
      <link>http://software.intel.com/en-us/articles/qa-from-webinar-solve-parallelism-with-intel-parallel-studio</link>
      <pubDate>Wed, 01 Jul 2009 15:25:29 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/qa-from-webinar-solve-parallelism-with-intel-parallel-studio#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/qa-from-webinar-solve-parallelism-with-intel-parallel-studio</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
    </item>
    <item>
      <title>Static Analysis and Intel® C++ Compilers Webinar Q&amp;A</title>
      <description><![CDATA[ <p>Q&amp;A from <a href="https://event.on24.com/event/36/88/3/rt/1/index.html."><strong>Webcast</strong></a>: The webinar "Static Analysis and Intel® C++ Compilers" was presented by Dmitry Petunin on June 2<sup>nd</sup>, 2009, as part of our technical webinar series about multithreading tools and techniques. The following questions were selected from the questions and answers generated by this webcast and may be useful to other developers as a reference.</p>
<p><b>Q:  </b><b>Can the Static Analysis capability verify the correctness of say a </b><b>Producer-Consumer problem with multiple threads?<br /></b><b>A:  </b>No, we do not have any specific checks for such a problem.<br /><b><br />Q:  </b><b>Does parallel composer run on Linux?<br /></b><b>A:  </b>No, the Intel® Parallel Composer is presently for Windows* only. But, you can<br />     use the latest "Intel® C++ Compiler Professional Edition" for Linux.<br /><b><br />Q:  </b><b>What is meant by "Compiler's interprocedural analysis capability", and </b><b>can correctness of <br />     multiprocess applications using synchronization </b><b>primitives in shared memory be checked?<br /></b><b>A:  </b>The inter-procedural analysis capability enables the compiler to analyze the code to determine if there is benefit in <br />     optimizations such as: inline function expansion, inter-procedural constant propagation, passing arguments in <br />     registers, dead code elimination, multi-file optimizations etc. Also, no, the correctness of multiprocess applications <br />     using synchronization primitives in shared memory cannot be checked.<br /><b><br />Q:  If "parallel lint" does not report on data dependencies in my program, </b><b>does it mean that I can be <br />     sure that my program is free of data </b><b>dependencies?<br /></b><b>A:  </b>No. In addition, Gödel's theorem says that this task cannot be solved in the general case.<br /><b><br />Q:  Is parallel lint a part of Compiler 11.1?<br /></b><b>A:  </b>Yes. Compiler 11.1 also includes a complete source checker tool.<br /><b><br />Q:  Hello, I am sort of confused! 
When we talk of compiler, is it the one</b><b> integrated in the Parallel Studio <br />     or the stand alone compiler version 11 </b><b>or both?<br /></b><b>A:  </b>Parallel lint is available in both Intel® Parallel Composer (which is available as an individual product, and is also <br />     integrated as a component of Intel® Parallel Studio) and the standalone Intel® C++ Compiler Professional Edition <br />     product.<br /><b><br />Q:  Are there plans to support Intel®Threading Building Blocks with Parallel </b><b>lint?<br /></b><b>A:  </b>Presently, there is no announced plans at this time, but will update as soon as any plan for this is available.<br /><b><br />Q:  Is Parallel Lint available for Fortran? Any plan to support Fortran if not </b><b>supported right now?<br /></b><b>A:  </b>Yes, it is available</p>
<ul>
<li>Useful links:<br />
<p><a href="http://software.intel.com/en-us/articles/parallel-lint/">http://software.intel.com/en-us/articles/parallel-lint/</a></p>
     </li>
</ul>
<p> </p> ]]></description>
      <link>http://software.intel.com/en-us/articles/static-analysis-and-intel-c-compilers-webinar-qa</link>
      <pubDate>Wed, 01 Jul 2009 15:19:42 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/static-analysis-and-intel-c-compilers-webinar-qa#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/static-analysis-and-intel-c-compilers-webinar-qa</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
    </item>
    <item>
      <title>Easy Ways to Solve Parallel Performance Challenges Webinar Questions and Answers</title>
      <description><![CDATA[ <p><b>During the "Easy Ways to Solve Parallel Performance Challenges" webinar, presented by Gary Carleton on April 21, we received the following questions, which we thought we would share with you: </b></p>
<p> </p>
<p><b>Q. What is the website for the Webinars?</b></p>
<p>A. Visit the webinars website <a href="https://event.on24.com/event/36/88/3/rt/1/index.html">https://event.on24.com/event/36/88/3/rt/1/index.html</a></p>
<p> </p>
<p><b>Q. </b><b>Does Intel® Parallel Studio provide any support for Microsoft* Visual Studio* 2003?</b></p>
<p>A. Microsoft* Visual Studio* 2005 and 2008 are supported. Please check: <a href="http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf">http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf</a></p>
<p> </p>
<p><b>Q. What operating systems are supported?</b></p>
<p class="Default">A. Please check system requirements here: <a href="http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf">http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf</a></p>
<p class="Default"> </p>
<p class="Default"><b>Q. Does code need to be compiled with the Intel® Compiler or can Microsoft Compilers be used?</b></p>
<p>A. You can use either the Intel® or the Microsoft compilers.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Does code need to be compiled in any special way to enable source-code views? </b></p>
<p>A. To provide accurate performance data, Intel® Parallel Amplifier requires debug information for the binary files it analyzes. Generating debug information should not affect compiler optimizations, but the linker may turn off some default optimizations, so verify that the optimization switches are still enabled. If debug information does not exist, the Amplifier may not unwind the call stack correctly. Note that in some cases the finalization of results for modules without debug information can take much longer than finalization with debug information present.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Do I still need to use Intel® VTune<sup>TM</sup> Performance Analyzer, Intel® Thread Checker, etc., given that Intel® Parallel Studio can do what VTune analyzer does?</b></p>
<p>A. VTune analyzer provides additional functionality. One example is VTune event-based sampling, the ability to track processor-level events. Intel® Parallel Amplifier, an Intel® Parallel Studio tool, provides information on the performance of your code. Use the Parallel Amplifier to analyze the following types of performance issues in your threaded applications:</p>
<p>- Identify the most time-consuming (hot) functions</p>
<p>- Locate sections of code that do not effectively utilize available processor time</p>
<p>- Determine the best sections of code to optimize for sequential performance and for threaded performance</p>
<p>- Locate synchronization objects that affect the program performance</p>
<p>- Find whether, where, and why your program spends time on input/output operations</p>
<p>- Identify and compare the performance impact of different synchronization methods, different numbers of threads, or different algorithms</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Is there much overlap in functionality between Intel® Parallel Studio and other threading tools?</b></p>
<p class="Default">A. There is some overlap. In addition, Parallel Studio is built on the latest technology (PIN) for faster analysis, Parallel Inspector includes memory checking in addition to "Thread Checking", and Parallel Amplifier has a Statistical Call Graph that profiles your application with low overhead to show where time is spent. See <a href="http://software.intel.com/sites/products/collateral/studio/Amplifier_VTune_Comparison.pdf">Intel® Parallel Amplifier vs. Intel® VTune<sup>TM</sup> Analyzer Comparison</a>.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Does the analysis done by Intel® Parallel Amplifier go to the level of identifying contention on global heap (so not just locking in the user's code, but also deep inside libraries)?</b></p>
<p class="Default">A. The data collector interrupts a process, collects samples of all active instruction addresses, and reconstructs the call sequence (stack) for each sample. Intel® Parallel Amplifier can therefore identify contention in system libraries. If you do not have the source code for those libraries, you will see just the library name, and you can walk the stack back up to the new/malloc/etc. call in your code.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Can the Intel® Parallel Amplifier attach to a running process?</b></p>
<p class="Default">A. Currently, this is not supported. However, a running process can be analyzed with VTune analyzer.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Does Intel® Parallel Studio support Linux?</b></p>
<p class="Default">A. Currently, Intel® Parallel Studio is designed for Microsoft Windows* and Microsoft* Visual Studio* only. We do have an existing line of tools, including Intel® VTune analyzer, Intel® Thread Checker, and the Intel® Compiler, which run on Linux*.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Is it possible to see the specific function in source code in concurrency view? </b></p>
<p>A. The source/Assembly window displays accurate information provided that:</p>
<p>- Your code is compiled with the debug information and debug information is written correctly in the debug information file (or symbol file).</p>
<p>- The source code file exists.</p>
<p>If there is no correct debug information, or the symbol file is unavailable, the assembly data may be incorrect. In this case, the Amplifier uses heuristics to define function boundaries in the binary module.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Does Amplifier support OpenMP* 3.0 tasks, the Win32 API, and Intel® Threading Building Blocks?</b></p>
<p class="Default">A. Yes, these are supported. Note that for Intel® Threading Building Blocks, the analysis obtained from Intel® Parallel Studio is thread based only (not task based).</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Does Intel® Parallel Studio work with the Intel® Fortran compiler?</b></p>
<p class="Default">A. Intel® Parallel Studio is designed, targeted, and tested for C++ software. Because the analysis is based on the binary rather than the source, how well it works with Fortran may vary. Please be aware that, in addition to Intel Parallel Studio, Intel offers an HPC line of products with full support for Fortran. This includes the Intel® Fortran compiler, of course, but also Intel® VTune Performance Analyzer and Intel® Thread Checker. The Intel Math Kernel Library also includes full Fortran interfaces to BLAS, LAPACK, FFT, and other common numerical algorithms. Most Fortran developers use the HPC line of Intel products.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/easy-ways-to-solve-parallel-performance-challenges-webinar-questions-and-answers</link>
      <pubDate>Wed, 01 Jul 2009 11:34:27 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/easy-ways-to-solve-parallel-performance-challenges-webinar-questions-and-answers#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/easy-ways-to-solve-parallel-performance-challenges-webinar-questions-and-answers</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
    </item>
    <item>
      <title>The Key to Scaling Applications for Multicore Webinar Questions and Answers</title>
      <description><![CDATA[ <p><b>During the "The Key to Scaling Applications for Multicore" webinar, presented by Paul Petersen and Mark Davis on May 5, we received the following questions, which we thought we would share with you: </b></p>
<p> </p>
<p><b>Q. </b><b>Does Intel® Parallel Studio provide any support for Microsoft* Visual Studio* 2003?</b></p>
<p>A. Microsoft* Visual Studio* 2005 and 2008 are supported. Please check system requirements at: <a href="http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf">http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf</a></p>
<p> </p>
<p><b>Q. What operating systems are supported?</b></p>
<p>A. Please check requirements at: <a href="http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf">http://software.intel.com/sites/products/documentation/studio/studio/en-us/2009/start/release_notes_studio.pdf</a></p>
<p> </p>
<p><b>Q. </b><b>Where does the profiling information come from? Is it an Intel tool?</b></p>
<p>A. Yes, you can use Intel® Parallel Amplifier to find hotspots in your program/application. Intel® Parallel Amplifier, an Intel® Parallel Studio tool, provides information on the performance of your code. Use the Parallel Amplifier to analyze the following types of performance issues in your threaded applications:</p>
<p>- Identify the most time-consuming (hot) functions</p>
<p>- Locate sections of code that do not effectively utilize available processor time</p>
<p>- Determine the best sections of code to optimize for sequential performance and for threaded performance</p>
<p>- Locate synchronization objects that affect the program performance</p>
<p>- Find whether, where, and why your program spends time on input/output operations</p>
<p>- Identify and compare the performance impact of different synchronization methods, different numbers of threads, or different algorithms</p>
<p> </p>
<p><b>Q. How do I get the material to the previous seminars and technical sessions I missed?</b></p>
<p>A. Go to <a href="http://www.intel.com/go/parallel">www.intel.com/go/parallel</a> under "Related Links" on the right and click "Free, on-demand parallelism webinars".</p>
<p> </p>
<p><b>Q. Are there any Intel software tools for parallelism on the Linux platform?</b></p>
<p class="Default">A. Yes. We do have an existing line of tools, including Intel® VTune analyzer, Intel® Thread Checker, and the Intel® Compilers, which run on Linux.</p>
<p> </p>
<p><b>Q. </b><b>What is the difference between Intel® VTune<sup>TM</sup> analyzer/Intel® Thread Checker and Intel® Parallel Studio? Or can they be used in a good combination somehow?</b></p>
<p class="Default">A. Yes, they can be used in combination. As for the differences, Parallel Studio is built on the latest technology (PIN) for faster analysis, Parallel Inspector includes memory checking in addition to "Thread Checking", and Parallel Amplifier has a Statistical Call Graph that profiles your application with low overhead to show where time is spent. See <a href="http://software.intel.com/sites/products/collateral/studio/Amplifier_VTune_Comparison.pdf">Intel® Parallel Amplifier vs. Intel® VTune<sup>TM</sup> Analyzer Comparison</a>.</p>
<p> </p>
<p><b>Q. </b><b>Have you had any experience using Intel Parallel Studio with large programs (500,000 lines)? I have used VTune analyzer in the past and it was difficult to use because the size of the program I was working with was large. </b></p>
<p class="Default">A. The Intel® Parallel Studio offers a more streamlined and simplified usage model for hotspot analysis similar to VTune analyzer. The sampling methodology used is the same. This means that with big applications you have to watch for sampling overhead and the influence this has on your application sampling. You may want to consider focusing your sampling on application subsets and individual application components rather than the whole application and use a stepped approach or an approach with a smaller input data stream. All this depends a bit on the exact architecture of your application. See <a href="http://software.intel.com/sites/products/collateral/studio/Amplifier_VTune_Comparison.pdf">Intel® Parallel Amplifier vs. Intel® VTune<sup>TM</sup> Analyzer Comparison</a>.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. Is </b><b>Intel® Threading Building Blocks (TBB) </b><b>open source, and where may I get the source for OpenMP*?</b></p>
<p>A. Intel® Threading Building Blocks (TBB) source files can be downloaded here: <a href="http://www.threadingbuildingblocks.org/download.php">http://www.threadingbuildingblocks.org/download.php</a>. Intel's implementation of OpenMP* is not an open source project.</p>
<p class="Default"> </p>
<p class="Default"><b>Q. If Intel® Parallel Advisor Lite finds the hotspots in the code, do we still need to use Intel® Parallel Amplifier to find hotspots? </b></p>
<p><b>A.</b> Intel® Parallel Advisor Lite works on the serial portions of your application.  It leverages Intel® Parallel Amplifier's hot spot analysis to help identify likely areas in your serial application to experiment with parallelism - in particular, we recommend using Parallel Amplifier's "Top Down" view where one would work up the call tree to find a likely site.</p>
<p>If all you need are hot spots, then Parallel Advisor Lite has already found them for you by leveraging Parallel Amplifier.  However, once you have added parallelism, Parallel Amplifier can help you tune the parallelism using the Concurrency analysis to see where the threading is being used effectively, and the Locks and Waits analysis to determine if the program is wasting resources in synchronization sequences.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/the-key-to-scaling-applications-for-multicore-webinar-questions-and-answers</link>
      <pubDate>Wed, 01 Jul 2009 11:30:10 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/the-key-to-scaling-applications-for-multicore-webinar-questions-and-answers#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/the-key-to-scaling-applications-for-multicore-webinar-questions-and-answers</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
    </item>
    <item>
      <title>Q&amp;A from Webcast “The Simplifying Parallelism Implementation with Intel® Threading Building Blocks”</title>
      <description><![CDATA[ <p><b>Q&amp;A from <a target="_blank" href="https://event.on24.com/event/36/88/3/rt/1/index.html" title="Webcast">Webcast</a> "The Simplifying Parallelism Implementation with Intel® Threading Building Blocks" presented by Michael D'Mello on 5/26/2009</b></p>
<p><strong>Q:</strong> Is the thread-to-core ratio hardwired 1:1 in Intel® Threading Building Blocks (Intel® TBB) or is it configurable?<br /><strong>A:</strong> By default, the number of worker threads in the Intel TBB thread pool equals the number of logical cores. The default can be changed: specify the desired number of threads as a parameter to the constructor of the Intel TBB initialization object, <code>task_scheduler_init</code>. Please visit <a href="http://www.threadingbuildingblocks.org/documentation.php">http://www.threadingbuildingblocks.org/documentation.php</a> and read the related chapter in the tutorial for the complete example.</p>
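<p>The default-versus-explicit choice described in this answer can be sketched with the C++ standard library alone. This is only an illustration, not Intel TBB code: <code>pick_worker_count</code> is a hypothetical helper, and treating 0 as "use the default" is an assumption made for this sketch.</p>

```cpp
#include <cstddef>
#include <thread>

// Hypothetical helper (not part of Intel TBB): decide how many worker threads to start.
// Assumption for this sketch: requested == 0 means "use the default",
// which is one worker per logical core.
std::size_t pick_worker_count(std::size_t requested) {
    if (requested > 0) {
        return requested;  // the caller chose an explicit thread count
    }
    // hardware_concurrency() may return 0 when the core count cannot be determined.
    const unsigned logical_cores = std::thread::hardware_concurrency();
    return logical_cores > 0 ? logical_cores : 1;
}
```

<p>For example, <code>pick_worker_count(4)</code> yields 4 regardless of the machine, while <code>pick_worker_count(0)</code> follows the number of logical cores reported by the system.</p>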
<p><strong>Q:</strong> What if my code contains data dependencies? Can Intel® Threading Building Blocks (Intel® TBB) detect them?<br /><strong>A:</strong> No, Intel TBB doesn't analyze code to detect data dependencies. Intel TBB is a library that provides generic algorithms and data structures that simplify threading, and it is recommended that all data dependencies be known before applying threading. Data races are arguably the most commonly encountered errors in parallel code, and they occur when data dependencies are not properly handled by the programmer. Intel® Parallel Inspector is a developer-oriented tool for finding data races in threaded applications.</p>
<p><strong>Q:</strong> What is the advantage of re-implementing ChangeArray(A) as a functor?<br /><strong>A:</strong> Most Intel® Threading Building Blocks (Intel® TBB) algorithms take a functor object as a parameter. To take advantage of the easy threading and "future proof" parallelism that Intel TBB algorithms deliver, a developer should implement the logic of a parallel task as a functor. A less highly touted feature of Intel TBB is that the library provides excellent mechanisms for task-based parallelism while still emphasizing the data parallelism model. For simple loops, one may well question the advantages of using Intel TBB as opposed to other threading models. For more complicated loops, however, Intel TBB seems to provide much more flexibility than other threading models. One example is iterating over a collection of items that is not indexed by an integer.</p>
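<p>A minimal sketch of the functor idea, written with the C++ standard library only. <code>IndexRange</code> and <code>run_in_chunks</code> are hypothetical stand-ins for the roles played by <code>tbb::blocked_range</code> and <code>tbb::parallel_for</code>; they are not the Intel TBB implementation. What <code>ChangeArray</code> does in the webinar is not stated here, so the doubling of each element is an assumption.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a library range type: the half-open interval [begin, end).
struct IndexRange {
    std::size_t begin;
    std::size_t end;
};

// The loop body packaged as a functor. Assumption: it doubles every element
// of the sub-range it is handed.
class ChangeArray {
public:
    explicit ChangeArray(std::vector<int>& data) : data_(&data) {}

    void operator()(const IndexRange& r) const {
        for (std::size_t i = r.begin; i < r.end; ++i) {
            (*data_)[i] *= 2;
        }
    }

private:
    std::vector<int>* data_;  // not owned; the caller keeps the vector alive
};

// Hypothetical helper: split [0, n) into at most `chunks` pieces and run a copy
// of `body` on each piece in its own thread. The pieces do not overlap, so the
// threads never touch the same element.
template <typename Body>
void run_in_chunks(std::size_t n, std::size_t chunks, const Body& body) {
    if (n == 0) {
        return;
    }
    chunks = std::max<std::size_t>(1, std::min(chunks, n));
    const std::size_t step = (n + chunks - 1) / chunks;  // ceiling division

    std::vector<std::thread> workers;
    for (std::size_t start = 0; start < n; start += step) {
        const IndexRange r{start, std::min(n, start + step)};
        workers.emplace_back([body, r] { body(r); });
    }
    for (std::thread& t : workers) {
        t.join();
    }
}
```

<p>Because the body is an object with <code>operator()</code>, the driver decides how the range is split and which thread runs each piece, for example <code>run_in_chunks(a.size(), 4, ChangeArray(a))</code>. The calling code stays the same when the number of chunks or threads changes.</p>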
<p><strong>Q:</strong> What do you mean by a "warmer" task when describing the effectiveness of task scheduling performed by Intel® Threading Building Blocks (Intel® TBB)?<br /><strong>A:</strong> A warmer task is a task whose data is most likely still in the cache. Executing "warm" tasks while their data is still in the cache is more efficient than executing "cold" tasks, whose data must first be fetched into the cache. The Intel TBB task scheduler favors "warm" tasks over "cold" tasks, which makes scheduling highly efficient.</p>
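<p>One common way to favor warm tasks is a per-thread double-ended queue: the owning thread takes the task it spawned most recently, while an idle thread that comes looking for work takes the oldest one. The sketch below illustrates that policy only. <code>LocalTaskPool</code> and its member names are hypothetical, tasks are represented by plain strings, and the locking a real scheduler needs is omitted.</p>

```cpp
#include <deque>
#include <string>
#include <utility>

// Hypothetical per-thread task pool; synchronization is deliberately left out.
class LocalTaskPool {
public:
    // The owning thread adds newly created tasks at the back.
    void spawn(std::string task) { tasks_.push_back(std::move(task)); }

    // The owning thread takes the most recently spawned task, whose data is
    // the most likely to still be in its cache ("warm").
    bool take_warm(std::string& out) {
        if (tasks_.empty()) {
            return false;
        }
        out = std::move(tasks_.back());
        tasks_.pop_back();
        return true;
    }

    // Another thread takes the oldest task, whose data is the least likely
    // to still be in the owner's cache ("cold").
    bool steal_cold(std::string& out) {
        if (tasks_.empty()) {
            return false;
        }
        out = std::move(tasks_.front());
        tasks_.pop_front();
        return true;
    }

private:
    std::deque<std::string> tasks_;
};
```

<p>After spawning tasks A, B, and C in that order, the owner's next task is C, while a thread that steals gets A.</p>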
<p><strong>Q:</strong>  Just thinking about use of cache_aligned_allocator vs. scalable_allocator - does Intel® Threading Building Blocks (Intel® TBB) provide an API for obtaining any metrics about the processor caches/cache-lines etc? <br /><strong>A:</strong> No. That type of information can be examined using the Intel® VTune Performance Analyzer, which can monitor processor-specific events including cache misses and other cache-related activity.</p>
<p><strong>Q:</strong> Your examples use for-loops; does the same apply to while-loops? How do I parallelize non-indexed loops with Intel® Threading Building Blocks (Intel® TBB)?<br /><strong>A:</strong> Intel TBB implements several template classes and functions that simplify threading of non-indexed loops, for example <code>parallel_do</code> (for while-loops) and <code>pipeline</code> (for data flow pipelines). Please see the Intel TBB documentation to learn more: <a href="http://www.threadingbuildingblocks.org/documentation.php">http://www.threadingbuildingblocks.org/documentation.php</a>.</p>
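<p>To show what a non-indexed loop looks like when each item becomes its own unit of work, here is a sketch using only the C++ standard library. <code>process_each</code> is a hypothetical helper, not the Intel TBB <code>parallel_do</code>; it starts one asynchronous task per item, which is fine for a small illustration but not how a production scheduler would distribute work.</p>

```cpp
#include <future>
#include <list>
#include <vector>

// Hypothetical helper: apply `body` to every item in [first, last), where the
// container is walked with iterators only (no integer index). Each item is
// handed to its own asynchronous task.
template <typename Iterator, typename Body>
void process_each(Iterator first, Iterator last, const Body& body) {
    std::vector<std::future<void>> pending;
    for (Iterator it = first; it != last; ++it) {
        // `body` is captured by reference; this is safe because every task
        // is waited for before the function returns.
        pending.push_back(std::async(std::launch::async, [&body, it] { body(*it); }));
    }
    for (std::future<void>& f : pending) {
        f.get();  // wait for completion and rethrow any exception from the task
    }
}
```

<p>A linked list such as <code>std::list&lt;int&gt;</code> can be processed this way with <code>process_each(items.begin(), items.end(), body)</code>, provided <code>body</code> only touches the item it is given.</p>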
<p><strong>Q:</strong> Does Intel® Parallel Studio require a C++ compiler or is one included?<br /><strong>A:</strong> Intel® C++ Compiler is included in Intel Parallel Studio; it is one of the components of Intel® Parallel Composer.</p>
<p><strong>Q:</strong> Which versions of Microsoft Visual Studio* are supported by Intel® Parallel Studio?<br /><strong>A:</strong> Microsoft Visual Studio* 2005 and 2008.</p>
<p><strong>Q:</strong> What do I gain when I compile the source code with Intel® C++ Compiler compared to Microsoft's compiler? If there is a gain, several projects are impossible to convert to Intel® Compiler, such as COM-projects using attributes. And projects using BOOST are cumbersome since I need to compile BOOST*-versions for the Intel Compiler. It would be great if Intel provided already compiled BOOST library dlls and libs?<br /><strong>A:</strong> That is correct. COM attributes are not supported right now. We don't provide special builds of 3<sup>rd</sup> party libraries but it should be easy to compile them with Intel C++ Compiler. Also, binaries built with Intel C++ Compiler are fully compatible with binaries built with Microsoft's compiler so both types of binaries can be safely mixed within one application. Therefore, you should be able to link object files created with the Intel compiler and your source code with the BOOST libraries created with the Microsoft compiler.  If you have any issues compiling BOOST with Intel C++ Compiler, please let us know via a forum: <a href="http://software.intel.com/en-us/forums">http://software.intel.com/en-us/forums</a>.</p>
<p><strong>Q:</strong> I know Intel® Threading Building Blocks (Intel® TBB) works with AMD processors, will Intel® Parallel Studio?<br /><strong>A:</strong> Intel® Parallel Studio runs on platforms with an IA-32 or Intel® 64 architecture processor supporting the Intel® Streaming SIMD Extensions 2 (Intel® SSE2) instructions (Intel® Pentium 4 processor or later, or compatible non-Intel processor).</p>
<p><strong>Q:</strong> Does Intel® Parallel Studio work under Microsoft Windows* 7 and Visual Studio* 10?<br /><strong>A:</strong> The first releases of Intel Parallel Studio have not been validated against Microsoft Windows* 7 and Visual Studio* 10, since neither is a released product. We will, of course, move forward closely aligned with the Microsoft roadmap and validate those platforms in future releases.</p>
<p><strong>Q:</strong> Is Intel® Threading Building Blocks (Intel® TBB) aware of dynamic logical partitioning activity, such as 2 cores out of, say, 8 being removed while the application is running?<br /><strong>A:</strong> Intel TBB does not have any dynamic mechanisms to detect such situations. The number of threads is selected during initialization of the Intel TBB task scheduler. The user either chooses how many threads will be created or allows Intel TBB to take the default (which is 1 thread per logical core). For the situation you mention, provided the logical partitioning mechanism does not interrupt normal code execution (due to some system-level dependency), Intel TBB code will continue to execute with a thread pool of the size determined at the time of scheduler initialization.</p>
<p> </p> ]]></description>
      <link>http://software.intel.com/en-us/articles/qa-from-webcast-the-simplifying-parallelism-implementation-with-intel-threading-building-blocks</link>
      <pubDate>Wed, 01 Jul 2009 11:00:24 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/qa-from-webcast-the-simplifying-parallelism-implementation-with-intel-threading-building-blocks#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/qa-from-webcast-the-simplifying-parallelism-implementation-with-intel-threading-building-blocks</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
    </item>
  </channel></rss>