<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Sat, 26 May 2012 03:47:02 -0700 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/intel-parallel-amplifier-kb/landing-links/feed" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles Feed</title>
    <link>http://software.intel.com/en-us/articles/intel-parallel-amplifier-kb/landing-links/feed</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Recognize User Synchronization Objects in Intel® Parallel Amplifier</title>
      <description><![CDATA[ <p>Many developers write their own synchronization primitives, but the <strong>Locks and Waits</strong> analysis in Intel® Parallel Amplifier recognizes only Windows*-defined synchronization objects such as events, mutexes, semaphores, and critical sections.</p>
<p>Intel® Parallel Amplifier provides libittnotify.dll and the ittnotify.h header, which let an application report user-defined synchronization objects to Intel® Parallel Amplifier at run time.</p>
<p>Here is a test case: the function is run by many threads. Each thread computes a value in the local variable "<strong>lpot</strong>" inside a loop, then adds that local value to the global variable "<strong>pot</strong>".</p>
<p>1)  for( i=start; i&lt;end; i++ ) {</p>
<p>      for( j=0; j&lt;i-1; j++ ) {</p>
<p>        distx = pow( (r[0][j] - r[0][i]), 2 );</p>
<p>        disty = pow( (r[1][j] - r[1][i]), 2 );</p>
<p>        distz = pow( (r[2][j] - r[2][i]), 2 );</p>
<p>        dist = sqrt( distx + disty + distz );     </p>
<p>       <strong>lpot += 1.0 / dist;</strong></p>
<p>      }</p>
<p>   }</p>
<p> </p>
<p>   EnterCriticalSection(&amp;cs);</p>
<p>      <strong>pot += lpot;</strong></p>
<p>   LeaveCriticalSection(&amp;cs);</p>
<p> </p>
<p>In this implementation, the time spent on the system object "<strong>CRITICAL_SECTION cs;</strong>" is analyzed in the Intel® Parallel Amplifier result.</p>
<p><br />2) for( i=start; i&lt;end; i++ ) {</p>
<p>      for( j=0; j&lt;i-1; j++ ) {</p>
<p>        distx = pow( (r[0][j] - r[0][i]), 2 );</p>
<p>        disty = pow( (r[1][j] - r[1][i]), 2 );</p>
<p>        distz = pow( (r[2][j] - r[2][i]), 2 );</p>
<p>        dist = sqrt( distx + disty + distz );     </p>
<p>        lpot += 1.0 / dist;</p>
<p>      }</p>
<p>   }</p>
<p> </p>
<p>  while (!spin) {</p>
<p>          spin = 1;</p>
<p>          pot += lpot;</p>
<p>   }</p>
<p>   spin = 0;</p>
<p> </p>
<p>In this implementation, the time spent on the user-defined "spin" object is <strong>NOT</strong> analyzed.</p>
<p> </p>
<p>3)  for( i=start; i&lt;end; i++ ) {</p>
<p>      for( j=0; j&lt;i-1; j++ ) {</p>
<p>        distx = pow( (r[0][j] - r[0][i]), 2 );</p>
<p>        disty = pow( (r[1][j] - r[1][i]), 2 );</p>
<p>        distz = pow( (r[2][j] - r[2][i]), 2 );</p>
<p>        dist = sqrt( distx + disty + distz );     </p>
<p>        lpot += 1.0 / dist;</p>
<p>      }</p>
<p>   }</p>
<p>  </p>
<p>   <strong>sync_prepare(&amp;spin);</strong></p>
<p>   while (!spin) {</p>
<p>          spin = 1;</p>
<p>         <strong> sync_acquired(&amp;spin);</strong></p>
<p>          pot += lpot;</p>
<p>   }</p>
<p>   <strong>sync_releasing (&amp;spin);</strong></p>
<p>   spin = 0;</p>
<p> </p>
<p>In this implementation, the time spent on the user-defined "spin" object is analyzed, because the notification calls report it to Intel® Parallel Amplifier.</p>
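<p>The spin loop in cases 2) and 3) is schematic: the test of "spin" and the store to it are separate steps, so it shows where the notifications go rather than providing real mutual exclusion. As a minimal sketch of the same pattern with an atomic test-and-set, assuming a C11 compiler with stdatomic.h, the example below uses counting stand-ins (hypothetical, not part of libittnotify) for the sync_prepare, sync_acquired and sync_releasing pointers so that it runs without the library:</p>

```c
/* Sketch only: test-and-set spin lock with notification hooks.
   The count_* functions are hypothetical stand-ins for the pointers
   that the article obtains from libittnotify.dll. */
#include <assert.h>
#include <stdatomic.h>

typedef void (*sync_hook)(void *);

/* Call counters, atomic because sync_prepare runs outside the lock. */
static atomic_int prepare_calls, acquired_calls, releasing_calls;

static void count_prepare(void *obj)   { (void)obj; atomic_fetch_add(&prepare_calls, 1); }
static void count_acquired(void *obj)  { (void)obj; atomic_fetch_add(&acquired_calls, 1); }
static void count_releasing(void *obj) { (void)obj; atomic_fetch_add(&releasing_calls, 1); }

/* Assumption: a real build would assign these from GetProcAddress(). */
static sync_hook sync_prepare   = count_prepare;
static sync_hook sync_acquired  = count_acquired;
static sync_hook sync_releasing = count_releasing;

static atomic_flag spin = ATOMIC_FLAG_INIT;   /* the user-defined lock */
static double pot = 0.0;                      /* shared accumulator */

void add_to_pot(double lpot)
{
    assert(sync_prepare && sync_acquired && sync_releasing);

    sync_prepare(&spin);                      /* about to wait on the lock */
    while (atomic_flag_test_and_set_explicit(&spin, memory_order_acquire)) {
        /* busy-wait until the current owner clears the flag */
    }
    sync_acquired(&spin);                     /* lock is now held */

    pot += lpot;                              /* protected update */

    sync_releasing(&spin);                    /* about to release the lock */
    atomic_flag_clear_explicit(&spin, memory_order_release);
}
```

<p>Each thread would call add_to_pot(lpot) once after its loop. The hooks bracket the wait and the hold time in the same order as in case 3).</p>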
<p> </p>
<p>Note that the libittnotify entry points can be obtained at run time as shown below:</p>
<p> </p>
<p>#include &lt;ittnotify.h&gt;</p>
<p>......</p>
<p>typedef void (*itt_notify_sync_prepare)(void *);</p>
<p>typedef void (*itt_notify_sync_acquired)(void *);</p>
<p>typedef void (*itt_notify_releasing)(void *);</p>
<p> </p>
<p>HMODULE hMod;</p>
<p>itt_notify_sync_prepare sync_prepare;</p>
<p>itt_notify_sync_acquired sync_acquired;</p>
<p>itt_notify_releasing sync_releasing;</p>
<p>......</p>
<p>  hMod = LoadLibrary("libittnotify.dll");</p>
<p>      </p>
<p>   sync_prepare = (itt_notify_sync_prepare) GetProcAddress(hMod, "__itt_notify_sync_prepare");</p>
<p>   sync_acquired = (itt_notify_sync_acquired) GetProcAddress(hMod, "__itt_notify_sync_acquired");</p>
<p>   sync_releasing = (itt_notify_releasing) GetProcAddress(hMod, "__itt_notify_sync_releasing");</p>
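<p>The snippet above does not check whether the library or its exports were found. A minimal sketch of a more defensive variant follows. The helper names load_itt_hooks and resolve and the no-op fallback itt_noop are hypothetical, and on non-Windows builds the loader is compiled out so the fallbacks stay in place:</p>

```c
/* Sketch only: resolve the entry points, falling back to no-ops. */
#include <assert.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

typedef void (*itt_sync_fn)(void *);

/* Hypothetical fallback used when the DLL or a symbol is missing. */
static void itt_noop(void *obj) { (void)obj; }

static itt_sync_fn sync_prepare   = itt_noop;
static itt_sync_fn sync_acquired  = itt_noop;
static itt_sync_fn sync_releasing = itt_noop;

#ifdef _WIN32
/* Overwrites *slot only if the export exists; returns 1 on success. */
static int resolve(HMODULE mod, const char *name, itt_sync_fn *slot)
{
    FARPROC p = GetProcAddress(mod, name);
    if (p == NULL)
        return 0;
    *slot = (itt_sync_fn)p;
    return 1;
}
#endif

/* Returns how many of the three entry points were resolved (0 to 3). */
int load_itt_hooks(void)
{
    int resolved = 0;
#ifdef _WIN32
    HMODULE hMod = LoadLibraryA("libittnotify.dll");
    if (hMod != NULL) {
        resolved += resolve(hMod, "__itt_notify_sync_prepare", &sync_prepare);
        resolved += resolve(hMod, "__itt_notify_sync_acquired", &sync_acquired);
        resolved += resolve(hMod, "__itt_notify_sync_releasing", &sync_releasing);
    }
#endif
    /* The pointers are never NULL: unresolved ones keep the no-op. */
    assert(sync_prepare && sync_acquired && sync_releasing);
    return resolved;
}
```

<p>With this arrangement the annotated code can call the three pointers unconditionally, and the application still runs when libittnotify.dll is not installed.</p>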
<p> </p>
<p><br />Finally, the user can view the result in Intel® Parallel Amplifier (Locks and Waits):<br /><br /><img src="http://software.intel.com/file/21815" alt="ittnotify.bmp" title="ittnotify.bmp" /></p> ]]></description>
      <link>http://software.intel.com/en-us/articles/recognize-user-synchronization-objects-in-intel-parallel-amplifier/</link>
      <pubDate>Mon, 30 Jan 2012 09:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/recognize-user-synchronization-objects-in-intel-parallel-amplifier/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/recognize-user-synchronization-objects-in-intel-parallel-amplifier/</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
    </item>
    <item>
      <title>Cannot use user-defined hotspots in command line?</title>
      <description><![CDATA[ <p>Some VTune<sup>(TM)</sup> Amplifier XE 2011 users prefer to collect performance data from the command line. The VTune Amplifier XE 2011 command-line interface originally supported only the predefined analysis types. With the release of VTune™ Amplifier XE 2011 Update 3, you can also run user-defined analyses from the command line, for example by configuring specific events for hardware-based sampling. Two cases are covered here:</p>
<p> </p>
<p>Case 1)      Hardware PMU event-based sampling analysis (User-defined)</p>
<p>Case 2)      User-mode sampling and tracing analysis (User-defined)</p>
<p> </p>
<p>For case 1), see the "<a href="http://software.intel.com/en-us/articles/event-configuration-from-the-command-line/">Event Configuration from the Command Line</a>" article, which documents how to configure those events and run the analysis.</p>
<p> </p>
<p>For case 2), when you run user-mode sampling and tracing analysis from the command line with "-collect-with runss" and change the sampling interval from 10 ms (the default) to 20 ms, you <strong>should</strong> specify collectSamplesMode explicitly (either "stack" or "nostack"). Here is an example:</p>
<pre name="code" class="shell"><br />$ amplxe-cl -collect-with runss -knob interval=20 --knob collectSamplesMode=stack -- ./primes.icc <br />Determining primes from 1 - 100000 <br />Found 9592 primes <br />Using result path `/home/peter/problem_report/r012runss' Executing actions 75 % <br />Generating a report <br />Summary ------- <br />Elapsed Time: 0.992 CPU Time: 2.290 Executing actions 100 % done <br />$ amplxe-cl -report hotspots -group-by function <br />Using result path `/home/peter/problem_report/r012runss' Executing actions 75 % <br />Generating a report <br />Function Module CPU Time <br />---------- ---------- -------- <br />findPrimes primes.icc 2.290 Executing actions 100 % done </pre>
<p> <br />If collectSamplesMode is not specified explicitly, the collection completes but the report step may fail with the error below, because the result directory contains no CPU samples. The root cause is that "-collect-with runss" expects collectSamplesMode to be set explicitly (to "stack" or "nostack").</p>
<pre name="code" class="shell">$ amplxe-cl -collect-with runss -knob interval=20 -- ./primes.icc 
Determining primes from 1 - 100000 
Found 9592 primes 
Using result path `/home/peter/problem_report/r011runss' Executing actions 75 % 
Generating a report 
Summary ------- 
Elapsed Time: 0.978 Executing actions 100 % done 
$ amplxe-cl -report hotspots -group-by function 
Using result path `/home/peter/problem_report/r011runss' Executing actions 75 % 
Generating a report Result directory does not contain CPU samples. 
Executing actions 100 % done Error: Error 0x40000024 (Reporter error)</pre> ]]></description>
      <link>http://software.intel.com/en-us/articles/cannot-use-user-defined-hotspots-in-command-line/</link>
      <pubDate>Mon, 31 Oct 2011 08:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/cannot-use-user-defined-hotspots-in-command-line/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/cannot-use-user-defined-hotspots-in-command-line/</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
    </item>
    <item>
      <title>Intel® Parallel Studio 2011 SP1 Release Notes</title>
      <description><![CDATA[ <p>This page provides the current Installation Guide and Release Notes for the Intel® Parallel Studio 2011 SP1 product. All files are in PDF format - <a target="_blank" href="http://www.adobe.com/go/EN_US-H-GET-READER">Adobe Reader* </a>(or compatible) required.</p>
<p>To get product updates, log in to the <a href="https://registrationcenter.intel.com/">Intel® Software Development Products Registration Center</a></p>
<p>For questions or technical support, visit <a target="_blank" href="http://software.intel.com/sites/support/">Intel® Software Developer Support</a></p>
<hr />
<p><strong>Version 2011 SP1 Initial release</strong>, September 6, 2011: <a href="http://software.intel.com/file/38337">release_notes_studio.pdf</a></p>
<hr /> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-parallel-studio-2011-sp1-release-notes/</link>
      <pubDate>Wed, 31 Aug 2011 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-parallel-studio-2011-sp1-release-notes/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-parallel-studio-2011-sp1-release-notes/</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
      <category>Intel® Parallel Advisor Knowledge Base</category>
    </item>
    <item>
      <title>Intel® Cilk™ Plus Support in Intel® Parallel Amplifier 2011</title>
      <description><![CDATA[ <a href="http://software.intel.com/en-us/articles/intel-cilk-plus/"><em>Intel® Cilk™ Plus</em></a><em> is a simple and powerful abstraction for expressing parallelism. It is one of the </em><a href="http://software.intel.com/en-us/articles/intel-parallel-building-blocks/"><em>Intel® Parallel Building Blocks</em></a><em> and it is included in </em><a href="http://software.intel.com/en-us/intel-parallel-composer/"><em>Intel® Parallel Composer 2011</em></a><em>, which is part of </em><a href="http://software.intel.com/en-us/intel-parallel-studio-home/"><em>Intel® Parallel Studio 2011</em></a><em>. In this initial introduction of Intel® Cilk™ Plus it is important to understand how the analysis features of Intel® Parallel Studio 2011 display results when Intel® Cilk™ Plus is used in your software. This article details the level of support provided by Intel® Parallel Amplifier 2011. Display of analysis results of software using Cilk™ Plus will become more informative in future releases. <br /></em><br /><strong>Overview<br /></strong>Intel® Parallel Amplifier 2011 will analyze Cilk Plus code and provide results. However, information about Cilk Plus code may not be represented in an intuitive way, and some features of Parallel Amplifier, such as source view, may not work properly on Cilk Plus code. Most of the limitations in how results are presented are due to the current implementation of the Cilk Plus abstractions which do not preserve a clean symbol mapping between the source code and the binary. Although you can expect this to be improved as the product matures, in its initial state this causes Cilk Plus functions to be referred to as &lt;unnamed-tag&gt;::operator(), which may be misleading. 
To assist you in interpreting your results, this article will walk you through some examples of how Cilk Plus code is currently represented in Parallel Amplifier 2011 analysis output. <br /><br /><strong>Hotspots Analysis and General Parallel Amplifier Functionality:<br /></strong>When it encounters cilk_for or cilk_spawn statements, the compiler creates lambda (on-the-fly anonymous) functions. In the case of a cilk_for it will encapsulate the body of the for loop in a lambda function so that it can be executed by multiple threads. A cilk_spawn statement results in a “spawn helper” lambda function that enables the passing of parameters to the spawned function. In either case, when Parallel Amplifier results contain samples from these lambda functions, they will be attributed to the proper module, but will be named &lt;unnamed-tag&gt;::operator() in the list of hotspots. Figure 1 shows the results of running Hotspots analysis on some simple Cilk Plus code that goes through a range of integers and counts prime numbers. The code contains one cilk_for call, wrapped in a function called parallel_count_primes. <br /><br /><a href="http://software.intel.com/en-us/articles/intel-cilk-plus/"><img src="http://software.intel.com/file/32808" alt="Figure1.PNG" title="Figure1.PNG" /></a><br /><em>Figure 1 - Hotspots results for simple cilk_for<br /></em><br /><br />The function called &lt;unnamed-tag&gt;::operator() in these results is the lambda function created by the compiler to execute the body of the cilk_for loop. The for loop is then implemented using a “divide and conquer” algorithm by the Cilk Plus runtime which calls the lambda function. This results in code which can be distributed efficiently across multiple workers, but also creates an unusual callstack.<br /><br />Figure 2 shows the partially expanded Top-Down Tree for the same result. The code that was run contained a cilk_for inside the function parallel_count_primes, and no other significant function calls. 
However, instead of seeing all the execution time grouped into the parallel_count_primes function, 39.7% of the execution time is attributed to an operator() within the parallel_count_primes tree, and 59.2% is attributed to a long chain of __cilkrts_cilk_for_32 function calls. This call chain demonstrates how the Cilk Plus run-time code is recursively expanding the work of the operator() function into chunks of work that get distributed to available threads. The Hotspots results show this expansion happening in the Cilk Plus runtime (cilkrts20) code. <br /><br /><a href="http://software.intel.com/en-us/articles/intel-cilk-plus/"><em><img src="http://software.intel.com/file/32809" alt="Figure2.PNG" title="Figure2.PNG" /></em></a><br /><em>Figure 2 – Top-down Tree Results for simple cilk_for</em><br /><br /><br />The recursive distribution of tasks is also evidenced by having many call stacks for an &lt;unnamed-tag&gt;::operator() function. Figure 3 shows the Call Stack pane for the simple count primes cilk_for code. The operator function for this code has 71 call stacks. 70 of the stacks are associated with the Cilk Plus run-time code (like the one shown).<br /><br /><a href="http://software.intel.com/en-us/articles/intel-cilk-plus/"><em><img src="http://software.intel.com/file/32810" alt="Figure3.PNG" title="Figure3.PNG" /></em></a><br /><em>Figure 3 – Call Stacks for &lt;unnamed-tag&gt;::operator() function for simple cilk_for <br /><br /><br /></em>Figure 4 shows the results of running Hotspots analysis on code that uses a cilk_spawn and cilk_sync in its algorithm to recursively find Fibonacci numbers. This time the &lt;unnamed-tag&gt;::operator() functions that show up in the results are the spawn helpers created each time a function is spawned. Like the previous example, these results show some Cilk Plus runtime activity as hotspots – such as the SwitchToFiber function (from kernel32.dll) that is called by the scheduler. 
The recursion inherent in the Fibonacci algorithm results in chains of fib-&gt; &lt;unnamed-tag&gt;::operator() calls in the call stacks. <br /><br />Because of the nature of the scheduling system, and the current partial support for Cilk Plus, Hotspots results for code using Cilk Plus may result in time being attributed to multiple hotspots. In this example, the fib, &lt;unnamed-tag&gt;::operator(), and background Cilk Plus functions in the list are all likely related to the same piece of code being executed – the spawned work of the fib function.<br /><br /><a href="http://software.intel.com/en-us/articles/intel-cilk-plus/"><em><img src="http://software.intel.com/file/32811" alt="Figure4.PNG" title="Figure4.PNG" /></em></a><br /><em>Figure 4 – Hotspots results for simple cilk_spawn and cilk_sync<br /></em><br /><br />When a project is analyzed that contains more than one Cilk Plus construct (such as 2 separate cilk_for loops), the time spent in the separate constructs may be grouped together into the same &lt;unnamed-tag&gt;::operator() function. Figure 5 shows the results of running Hotspots analysis on a project that contained two separate functions for finding primes using cilk_for, each called once. <br /><br /><a href="http://software.intel.com/en-us/articles/intel-cilk-plus/"><em><img src="http://software.intel.com/file/32812" alt="Figure5.PNG" title="Figure5.PNG" /></em></a><br /><em>Figure 5 – Hotspots results for code containing 2 cilk_for loops, each executed<br /><br /></em>The two cilk_for loops were wrapped in two functions called parallel_count_primes and parallel_counter_2. However the hotspots results and call stacks do not contain any references to the parallel_counter_2 function – all significant cilk_for execution time was attributed to the &lt;unnamed-tag&gt;::operator() function with the parallel_count_primes function as its caller. 
<br /><br />Finally, because time may not be correctly attributed in code that runs multiple Cilk Plus constructs, double-clicking the &lt;unnamed-tag&gt;::operator functions or their call stacks may not open at the right place in Source View mode. For example, double-clicking on the &lt;unnamed-tag&gt;::operator function in Figure 5 always results in viewing the parallel_count_primes function in the Source View, and never shows the parallel_counter_2 function. It may not be possible to separate which time was spent in which construct. <br /><br />Although this is not guaranteed, for code containing only one Cilk Plus construct, double-clicking the &lt;unnamed-tag&gt;::operator will usually open the correct source code file and pinpoint the correct function. Double-clicking on a __cilkrts call stack for the &lt;unnamed-tag&gt;::operator may take you to the Cilk Plus run-time instead of the correct place in user code. <br /><br /><strong>Concurrency Analysis:<br /></strong>Concurrency Analysis represents Cilk Plus constructs in the same way as Hotspots Analysis: code will be grouped into one or more &lt;unnamed-tag&gt;::operator() functions, may have callstacks showing Cilk Plus run-time code, may not attribute time within Cilk Plus constructs properly, and may not open source view properly for Cilk Plus code. The concurrency values (poor, OK, ideal, etc.) for Cilk Plus code have generally been correct in limited internal testing, but are not guaranteed to be. Figure 6 shows the results of Concurrency Analysis on the simple find primes code with one cilk_for (the same code used in Figures 1 and 2). When the code ran, 3 software threads executed – a main thread, and two worker threads created by the Cilk Plus runtime. The results correctly show that all three of these threads executed tasks created from the cilk_for construct. In general, the main thread (labeled wmainCRTStartup) will bind to the Cilk Plus runtime and begin running the scheduling code. 
It may also complete some of the tasks. Worker threads are created by the Cilk Plus runtime (executed by the main thread) and will only be executing available spawned work.<br /><br /><a href="http://software.intel.com/en-us/articles/intel-cilk-plus/"><em><img src="http://software.intel.com/file/32813" alt="Figure6.PNG" title="Figure6.PNG" /></em></a><br /><em>Figure 6 – Concurrency results in thread view for simple cilk_for code<br /><br /><br /></em><strong>Locks and Waits Analysis:<br /></strong>Locks and Waits Analysis shows the synchronization objects in an application and how long the processing cores spent waiting on each, as well as how utilized the cores were during the wait. For a Cilk Plus program, several synchronization objects that are part of the run-time library may show up in the results. These synchronization objects will be labeled as being part of Cilk Plus (under Sync Object Type). Figure 7 shows an example of Locks and Waits analysis results for the simple count primes program with one cilk_for loop. This code also contains one Cilk Plus hyper-object: a reducer used to hold the count of primes found. There are three synchronization objects from within Cilk Plus that are identified as having significant waiting time: one Intel Cilk Plus Scheduler object, one Intel Cilk Plus Completion Semaphore object, and one Intel Cilk Plus Initialization object.<br /><br />At this point, wait times and utilizations are not guaranteed to be correct for the Cilk Plus constructs and run-time objects, but they might give an idea of where overhead is occurring. 
Where wait times with poor utilization are seen in objects in the Cilk Scheduler, this may be an indication that there is not enough work to keep the Cilk Worker Threads busy (increase problem size), too much scheduling overhead (increase task or grain size, change algorithm), or another issue. Double-clicking the synchronization objects in the Cilk Scheduler or the Cilk User Thread may not lead to the proper source code line in Source Code View.<br /><br />Wait times on the three objects in this example are generally not an indication of an issue, regardless of the utilization time shown. Waiting on the Intel Cilk Plus Scheduler object is by design when only one Cilk Plus application is being run and the default number of threads is being used. It occurs because the Cilk Plus run-time creates N worker threads, where N is by default the number of processing cores available, but for a single running application will usually only have N-1 threads executing tasks, plus the main (user) thread. Waiting occurs on the Nth thread that was created but is not doing work. Some waiting on the Cilk Plus Completion Semaphore is also expected – this occurs when the main thread completes its last task before one or more Cilk Plus worker threads complete their last tasks. Once all the Cilk Plus worker threads are done, the main thread will resume and return to the main application. A small wait on the Cilk Plus Initialization object should occur normally as part of the start-up of the run-time.<br /><br /><a href="http://software.intel.com/en-us/articles/intel-cilk-plus/"><em><img src="http://software.intel.com/file/32814" alt="Figure7.PNG" title="Figure7.PNG" /></em></a><br /><em>Figure 7 – Locks and Waits results for simple cilk_for code<br /></em><br /><br /><strong>Summary and Where to go for Help <br /></strong>As mentioned in the introduction, Parallel Amplifier should analyze projects containing Cilk Plus code without crashing. 
The information above should give some guidance as to how to interpret the results of analysis on Cilk Plus constructs. For additional help, please post a question on the <a href="http://software.intel.com/en-us/forums/intel-parallel-studio/">Intel Parallel Studio forum</a>.<br /> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-cilk-plus-support-in-intel-parallel-amplifier-2011/</link>
      <pubDate>Thu, 09 Dec 2010 21:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-cilk-plus-support-in-intel-parallel-amplifier-2011/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-cilk-plus-support-in-intel-parallel-amplifier-2011/</guid>
      <category>Intel® Parallel Composer</category>
      <category>Intel® Parallel Amplifier</category>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
    <item>
      <title>Intel® Software Development Products Technical Presentations</title>
      <description><![CDATA[ Spend a few minutes to jump start your product experience. Please join us for one of the following Intel® Software Development Products technical presentations. These one hour presentations give you a chance to view a short live presentation or demo and then ask questions to our support engineers either about the presentation/demo or about anything related to using the product presented.<br /><br />Want to learn about upcoming technical presentations or when recordings are posted? You can subscribe via an <a href="http://software.intel.com/en-us/blogs/tag/SWProdTechPres/feed/">RSS feed </a>or via <a href="http://feedburner.google.com/fb/a/mailverify?uri=ISNSWTechPres&amp;loc=en_US">email</a>.<br /><br />
<div>
<div>
<div><span class="sectionHeading">Upcoming Technical Presentations<br /></span><br /><br /><strong>Future-Proof Your Application's Performance With Vectorization<br /></strong><br />You’ve heard of using parallelism to run your application across multiple cores. Vectorization is another level of parallelism that occurs within 1 CPU core – it enables one instruction to operate on multiple pieces of data at the same time. Vectorization is an important contributor to performance on current x86 processors, including the 2nd Generation Intel® Core™ Processor Family, and is vital for performance on future processors such as the Intel® Many Integrated Core (Intel® MIC) architecture. Understanding how to vectorize your applications now will allow much easier migration to future processor architectures. Scientific, engineering, and multimedia applications are all potential candidates for this technology. <br /><br />This presentation is for C, C++, and Fortran developers, and will help you get started understanding and evaluating vectorization using new technologies such as Intel® Cilk Plus, pragma SIMD and the Intel Compiler’s Guided Auto Parallelization report. We will also discuss the pros and cons of various techniques and usages. <br /><br />As a special benefit for live attendees, you will also have the opportunity to request a follow-up with an Intel vectorization expert! Don’t miss this opportunity to position your application for the future! <br /><br /><strong><a href="http://software.intel.com/en-us/articles/future-proof-your-applications-performance-with-vectorization/">First session recording and Q&amp;A available. 
<br /></a></strong><strong><br /></strong><br /><span class="sectionHeading">Past Technical Presentations<br /></span><br /><strong>Analysis of hybrid applications with the Intel Cluster Studio XE 2012<br /></strong><br />With the launch of Intel® Cluster Studio XE 2012, Intel enhanced the premium software development tools package for clusters with the inclusion of MPI support in Intel® Parallel Studio XE, and added new features for better scalability and improved performance. This session will introduce you to all MPI components of the new Intel® Cluster Studio XE 2012. You’ll learn how to use the new and more scalable startup mechanism to run MPI applications well up to 90000 cores, you’ll take a dive into benchmark data, and the improvements and details of the mpitune tool, and you’ll see, in an interactive demo, key elements and new scalability features of Intel® Trace Analyzer and Collector. Finally, you’ll be shown how to enable the new MPI support in the Intel® VTune™ Amplifier XE and Intel® Inspector XE tools.<br /><br /><strong><a href="http://software.intel.com/file/40265">Recording available </a>(Windows viewable only)<br />Presented: </strong>Wednesday, December 7, 2011 9:00 AM - 10:00 AM PST<br /><a href="http://software.intel.com/file/40266"><strong>Presentation slides (PDF format)</strong></a><br /><br /><strong>Using Intel(R) VTune(TM) Amplifier XE to Tune Software on Intel(R) Microarchitecture Codename Sandy Bridge, Part 2: Common Issues &amp; Tuning Suggestions<br /><br /></strong>This webinar is the second part of our 2-part series on Using Intel(R) VTune(TM) Amplifier XE to Tune Software on Intel(R) Microarchitecture Codename Sandy Bridge. We recommend you watch part 1 first unless you are already familiar with the VTune Amplifier XE Sandy Bridge interface and the pipeline slots methodology. 
This technical presentation will discuss common performance issues, how to measure their impact on Sandy Bridge, and specific suggestions for resolving each.<br /><br /><strong>Presented: </strong>Wednesday, November 9, 2011 9:00 AM - 10:00 AM PST<br /><strong><a href="http://software.intel.com/file/39743">Recording available (Windows viewable only)</a><br /><a href="http://software.intel.com/file/37087/">Presentation slides (PDF format)<br /></a><br />Using Intel(R) VTune(TM) Amplifier XE to Tune Software on Intel(R) Microarchitecture Codename Sandy Bridge, Part 1: Methodology &amp; Interface. <br /><br />Presented: </strong>Tuesday, November 8, 2011 9:00 AM - 10:00 AM PST<br /><a href="http://software.intel.com/file/39729"><strong>Recording Available (Windows viewable only)<br /></strong></a><a href="http://software.intel.com/file/37087/"><strong>Presentation slides (PDF format)</strong></a><br /><br />This technical presentation is part of a 2-part series on Using Intel(R) VTune(TM) Amplifier XE to Tune Software on Intel(R) Microarchitecture Codename Sandy Bridge. Part 1 will discuss the VTune Amplifier XE and its new features specifically for performance analysis on Sandy Bridge. It will also detail our general performance tuning methodology, based on hotspots. The final section will cover the Sandy Bridge microarchitectural details you need to understand to get the most from our Sandy Bridge tuning guide and interface.<br /><br /><strong>Using Intel(R) Inspector XE with Fortran Applications<br /><br /></strong>This technical presentation will present an overview of the powerful correctness and security checking features of Intel® Inspector XE. There will be a focus on using Intel Inspector XE on Fortran applications. The presentation will include example problems detected by the memory, threading, and static security analysis tools as well as some possible solutions. 
For more details, please check the following blog post.<br /><br />Presented: Wednesday, Aug 17, 2011, 9:00 AM - 10:00 AM PDT<br /><strong><a target="_blank" href="http://software.intel.com/file/39553">Recording Available (Windows viewable only)</a><br /><a href="http://software.intel.com/file/39552">Presentation slides (PDF format)<br /></a><br />Modeling parallelism with Intel® Parallel Advisor<br /><br /></strong>An application written in a sequential language like C++ can be understood in two ways. It can be understood as an exact specification of how a program must execute, or it can be understood as a specification of the kinds of computations that must be performed. In the Parallel Advisor, we exploit the second interpretation by introducing a modeling language that can be embedded into your sequential application. This modeling language allows you to precisely specify where and how the sequential execution of your application is over-constrained and what flexibility you are willing to utilize to harness parallel execution. This talk will describe the modeling language, show the benefits of parallel modeling over parallel execution, and illustrate the correspondence of the parallel modeling language to common idioms available in Intel® Threading Building Blocks and Intel® Cilk™ Plus.<br /><br />Presented: Thursday, July 21, 2011, 9:00 AM - 10:00 AM PDT<br /><a href="http://software.intel.com/file/39497"><strong>Recording Available (Windows viewable only)</strong><br /></a><strong><a href="http://software.intel.com/file/39500">Presentation slides (Powerpoint format)</a><br /></strong><br /><strong>Intel® Parallel Advisor 2011 Shows Its Stuff on Duplo <br /><br /></strong>Intel® Parallel Advisor 2011 provides the information and the tools needed by any C/C++ programmer to add safe and effective parallelism to their programs. 
This is demonstrated by applying Advisor to Duplo, a serial, open-source application for finding duplicate blocks of code in a set of source files. Parallel Advisor is a component of Intel® Parallel Studio 2011 and is a free download for Intel® Parallel Studio XE. It is integrated into Microsoft Visual Studio*.<br /><br />In this presentation, you will learn how to: <br />• Find the places in Duplo where parallelism can be usefully added <br />• Find and replace the parts of Duplo that prevent parallelism <br />• Test the revised version of Duplo for parallel correctness and performance<br />• While keeping Duplo serial through these steps!<br />• Implement the parallelism using Intel® Cilk™ Plus<br /><br />Along the way we discover an unexpected opportunity to improve the serial performance by 30%. We also encounter two ordering dependencies that almost derail parallelization, until Cilk’s hyperobjects come to the rescue. Finally, we see how closely Advisor’s parallel performance estimates match the actual speed-ups of the parallel version of Duplo.<br /><br />Presented: Thursday, July 21, 2011, 9:00 AM - 10:00 AM PDT<br /><strong><a href="http://software.intel.com/file/39498">Recording Available (Windows viewable only)</a><br /><a href="http://software.intel.com/file/39501">Presentation slides (Powerpoint format)</a><br /><br />Choosing Where To Introduce Parallelism (using Intel® Parallel Advisor 2011)<br /><br /><br /></strong>The Intel® Parallel Advisor 2011 has a feature that surveys your running program and shows you cumulative time spent within functions and loops. Learn how to combine this information with your knowledge of the program to decide where to invest your time adding parallelism. 
The presenter Bevin Brett will describe how you should consider both program structure and data structure as you make this decision.<br /><br />Presented: Wednesday, March 23, 2011, 9:00 AM - 10:00 AM PDT<br /><strong><a href="http://software.intel.com/file/39504">Recording Available (Windows viewable only)</a></strong><br /><strong><a href="http://software.intel.com/file/39505">Presentation slides (Powerpoint format)</a></strong><br /><br /><strong>Topic: Getting More out of your CPU – Using Intel® C++ Composer XE to Maximize Code Vectorization and Improve Application Performance<br /><br /></strong>SIMD (Single Instruction, Multiple Data) instructions have long been an important performance feature of Intel (and Intel-compatible) CPUs. Now, with the introduction of the Sandy Bridge processor family and its support for the new Intel® Advanced Vector Extensions (Intel® AVX) instructions, taking advantage of these instructions remains one of the best ways to optimize for performance. In this technical presentation, we will show how you can use the state-of-the-art auto-vectorizer provided with Intel® C++ Composer XE to generate these instructions for you automatically from high-level C++, and how you as the programmer can encourage Composer XE to generate efficient code even when using traditionally vectorization-unfriendly codes like arrays of structures or mixed datatypes.<br /><br />Presented: March 15, 2011<br /><br /><strong><a href="http://software.intel.com/file/34861">Recording Available  (Windows viewable only)<br /></a><br /><a href="http://software.intel.com/file/34773">Presentation slides (Powerpoint format)<br /></a><br /><a href="http://software.intel.com/en-us/articles/kernel-template-library/">Kernel Template Library Code Examples<br /></a><br /><br /><br />Topic: Intel® ArBB Code Tips II – A compilation of best practices and useful hints<br /><br /><br /></strong>Thanks to our user base and growing community, we are extending the first Intel® Array Building 
Blocks (Intel® ArBB) "Code Tips" webinar. In this webinar, we discuss best practices including:<br /><br />• How to develop in immediate mode and later "toggle" to production/JIT code <br />• How to get initial data into a container (memory mapping and binding) <br />• How to update the values of a container according to an index<br /><br />We share code examples, background information, and insight into our design decisions. This webinar is great for developers who want to have a fresh start after some initial steps, or people who are preparing to have a look at Intel ArBB. It is also a great chance to ask questions of Intel ArBB engineers and experts during and after this webinar.<br /><br />Presented: February 22, 2011</div>
<div>
<p><br /><strong><a href="http://software.intel.com/file/34411">Recording available</a></strong></p>
<p><strong><a href="http://software.intel.com/file/34410">Presentation available</a><br /><br /><br /><br /></strong><a name="Quickly-Manipulate-Data"></a><strong>Topic: Intel® Parallel Building Blocks: Quickly Manipulate Data in Parallel Using Intel® Cilk™ Plus Array Notation/Elemental Functions<br /><br /></strong>As multicore systems become prevalent on desktops, servers and laptop systems, new performance leaps will come as the industry adopts parallel programming techniques. However, many parallel environments consist of confusing, complex and error-prone rules and constructs. The Intel Cilk Plus language, built on the Cilk technology developed at M.I.T. over the past two decades, is designed to provide a simple, well-structured model that makes development, verification and analysis easy. Because Intel Cilk Plus is an extension to C and C++, programmers typically do not need to restructure programs significantly in order to add parallelism. <br /><br />This technical presentation will focus on examples showing the benefits of using Cilk Plus array notation and elemental functions to define operations that can be run on multiple data elements simultaneously using the underlying Intel® Streaming SIMD Extensions provided by Intel® CPUs.<br /><br />Presented: February 1, 2011<br /><br /><br /><strong><a href="http://software.intel.com/file/33996">Recording available</a> <br /><br /><a href="http://software.intel.com/file/33943">Presentation available<br /><br /></a><br /><br />Topic: Intel® Parallel Building Blocks: Quickly Write Parallel Tasks Using Intel® Cilk™ Plus Keywords and Reducers<br /><br /></strong>As multicore systems become prevalent on desktops, servers and laptop systems, new performance leaps will come as the industry adopts parallel programming techniques. However, many parallel environments consist of confusing, complex and error-prone rules and constructs. The Intel Cilk Plus language, built on the Cilk technology developed at M.I.T. 
over the past two decades, is designed to provide a simple, well-structured model that makes development, verification and analysis easy. Because Intel Cilk Plus is an extension to C and C++, programmers typically do not need to restructure programs significantly in order to add parallelism. <br /><br />This technical presentation will focus on examples showing the benefits of using the Cilk Plus keywords cilk_spawn, cilk_for and cilk_sync along with Cilk Plus reducers such as reducer_opadd to define tasks that can be run on different cpu cores in parallel.<br /><br />Presented: January 18, 2011<br /><br /><strong><a href="http://software.intel.com/file/33650">Recording Available<br /></a><br /><a href="http://software.intel.com/en-us/articles/quickly-write-parallel-tasks-using-intel-cilk-plus-keywords-and-reducers-technical-presentation-questions-and-answers">Q&amp;A and Presentation Available<br /><br /></a><br />Topic: What's New with Intel® Fortran Composer XE 2011?<br /><br /></strong>Aside from the obvious name change, Intel® Fortran Composer XE 2011 launched in early November brings many new features to Intel's Fortran implementation. In this technical presentation, Steve Lionel and Ron Green from Intel's Fortran Support team discuss the new Fortran features added in Intel Fortran, including our exciting new Coarray Fortran implementation. This webinar also gives Intel Fortran users a chance to ask questions about the product, the name change, and directions for Intel Fortran going forward. <br /><br />Presented: December 14, 2010<br /><br /><a href="http://software.intel.com/file/32861"><strong>Recording Available<br /><br /></strong></a><br /><strong>Topic: Intel ArBB Code Tips<br /><br /></strong>This webinar is an intermediate-level talk for users who have had some experience with Intel® Array Building Blocks. But new users may also benefit from it by getting a jump start on programming Intel® ArBB. 
During this one-hour presentation, we share many code tips covering the following topics:<br /><br />1. How to express parallelism using container operations and the arbb::map() function<br />2. User defined types and functions<br />3. How to program for performance<br />4. Pitfalls and misuses to avoid<br /><br />Presented: December 9, 2010<br /><br /><strong><a href="http://software.intel.com/en-us/articles/arbb-webinar-dec-9-2010/">Recording and Presentation Slides Available<br /></a></strong><br /><br /><strong>Topic: Accelerate your multimedia and data processing application with the Intel® Integrated Performance Primitives (Intel® IPP) 7.0<br /></strong><br />The Intel® IPP library is a collection of highly optimized software functions for use with a wide range of applications, including digital media, signal processing and data-processing applications and is included as a component within the Intel Parallel Studio developer's toolkit. <br /><br />This webinar will cover key new features and changes that are part of the IPP 7.0 release, and provide a review of the drop-in high-level data compression libraries now included with the Intel IPP library: ipp_zlib, ipp_bzip2, ipp_gzip and ipp_lzopack. New features in the Intel® IPP 7.0 library include: <br /><br />• Data Compression Library support <br />• JPEG-XR support and imaging performance improvements <br />• Optimizations for the 256-bit AVX SIMD instruction set <br />• Intel® AES-NI (cryptography) optimizations<br /><br />Presenter: Paul Fischer<br />Presented: November 18, 2010</p>
<p><a href="http://software.intel.com/en-us/articles/questions-and-answers-from-the-11-18-2010-ipp-webinar/"><strong>Recording and Q&amp;A Available</strong></a><br /><br /><br /><strong>Topic:</strong> <strong>Super Charge Applications with Intel® Integrated Performance Primitives – A Component of Intel® Parallel Studio 2011<br /><br /></strong>Intel® Integrated Performance Primitives (Intel® IPP) is an extensive library of multicore-ready, highly optimized software functions for digital media and data-processing applications and comes with Intel Parallel Studio 2011 and Intel® Parallel Composer 2011. We will show how to set up an application to use Intel IPP from within Intel Parallel Studio, what kinds of functions the library has to offer, and an example of the performance benefits from using the library.<br /><br />Presenter: Walter Shands<br />Presented: October 26, 2010<br /><br /><a href="http://software.intel.com/en-us/articles/questions-and-answers-from-the-intel-integrated-performance-primitives-webinar-on-october-26-2010/"><strong>Recording and Q&amp;A Available</strong></a></p>
<p><br /><br /><strong>Topic: Intel® Array Building Blocks Technical Presentation: Introduction and Q&amp;A<br /><br /></strong>Intel® Array Building Blocks provides a generalized data parallel programming solution that frees application developers from dependencies on particular hardware architectures. It offers an API that allows parallel algorithms to be expressed at a high level. Its dynamic compiler and runtime produce scalable, portable and deterministic parallel implementations from the single high-level source. This technical presentation is an introduction to Intel ArBB. We will discuss the main features of Intel ArBB, and walk through some sample code to demonstrate benefits such as forward-scaling, safety-by-design, and ease-of-use. There will be a Q&amp;A session at the end of the presentation to answer any questions you have about Intel ArBB. <br /><br />Presenter: Noah Clemons<br />Presented: October 14, 2010<br /><br /><a href="http://software.intel.com/file/31293"><strong>Recording available<br /></strong><br /></a><br /><br /><strong>Topic: Adding Parallelism with Intel® Parallel Advisor 2011: No Parallelism Experience Required<br /><br /></strong>Intel® Parallel Advisor 2011 provides the information and the tools needed by any C/C++ programmer to add safe and effective parallelism to their programs. 
Parallel Advisor is a component of Intel® Parallel Studio 2011 and is integrated into Microsoft Visual Studio*.<br /><br />In this presentation, you will learn how to: <br />• Find the places in the program where parallelism can be usefully added <br />• Find and replace the parts of the program that prevent parallelism <br />• Test the revised program for parallel correctness and performance<br />• Keep the program serial through these steps<br />• Add parallelism to code samples / examples<br /><br />Presenter: Mark Davis<br />Presented: October 12, 2010<br /><br /><a href="http://software.intel.com/file/31242"><strong>Recording available<br /></strong><br /></a><br /><br /><strong>Topic: Introduction to Intel® Cilk™ Plus<br /></strong><br />Intel® Cilk Plus, one of the Intel® Parallel Building Blocks which includes Intel® Threading Building Blocks and Intel® Array Building Blocks, provides C/C++ language extensions to implement parallelism in your application simply and efficiently. This technical talk will cover this new syntax, supported by the C++ compiler in Intel® Parallel Composer, and how it provides an easy way to use task and vector parallelism to take full advantage of multiple cores and the SIMD-compute engine on CPUs.<br /><br />Presenter: Brandon Hewitt<br />Presented: September 28, 2010<br /><br /><a href="http://software.intel.com/file/30848"><strong>Recording available<br /></strong></a><br /><br /><br /><strong>Topic: Introducing Intel® Parallel Building Blocks<br /><br /></strong>This technical presentation will introduce three methods for adding parallelization to your serial application: Intel® Threading Building Blocks, Intel® Cilk™ Plus, and Intel® Array Building Blocks. The methods will be compared showing where each will be most beneficial and to what types of applications. 
In addition to the presentation there will be a brief demo followed by time for Q&amp;A on these threading methods or any other questions you might have about using Intel® Parallel Studio.<br /><br />Presenter: Noah Clemons<br />Presented: September 23, 2010<br /><br /><a href="http://software.intel.com/file/31246"><strong>Recording available<br /></strong></a><br /><br /><strong>Topic: Introducing Intel Parallel Inspector<br /></strong><br />Intel® Parallel Inspector is a serial and multithreading error checking analysis tool for Microsoft Visual Studio* C/C++ developers. Inspector detects challenging memory leaks and corruption errors as well as threading data races and deadlock errors. This comprehensive developer productivity tool pinpoints errors and provides guidance to help ensure application reliability and quality. This technical presentation will include an overview of the tool, live demo, and Q&amp;A session. You are welcome to ask questions about any part of Intel® Parallel Studio during the Q&amp;A session.<br /><br />Presenter: Jackson Marusarz<br />Presented: August 19, 2010<br /><br /><a href="http://software.intel.com/file/29789"><strong>Recording available<br /></strong></a><br /><br />These webinars are part of a program from the Intel® Software Development Products technical consulting team to deliver technical presentations to customers.</p>
</div>
</div>
</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-software-development-products-technical-presentations/</link>
      <pubDate>Thu, 02 Dec 2010 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-software-development-products-technical-presentations/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-software-development-products-technical-presentations/</guid>
      <category>Parallel Programming</category>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Linux* Knowledge Base</category>
      <category>Intel® Fortran Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® Integrated Performance Primitives Knowledge Base</category>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
      <category>Intel® Visual Fortran Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Advisor Knowledge Base</category>
    </item>
    <item>
      <title>&amp;#34;Attach debugger?&amp;#34; Message</title>
      <description><![CDATA[ <strong>Problem :<br /></strong>After running a VTune™ Amplifier XE 2011 data analysis from within Microsoft* Visual Studio*, you may see a dialog box with the caption "<strong>Attach debugger?</strong>" and the message<br /><br /><strong>&lt;type 'exceptions.ImportError'&gt; No module name pythonhelpers1 &lt;string&gt;(1): &lt;module&gt;</strong><br /><br /><img src="http://software.intel.com/file/32221" alt="AXE+python+error.PNG" title="AXE+python+error.PNG" /><br /><br />Pressing Yes or No usually results in the display of the collected data.  Sometimes the dialog is NOT displayed, and in those cases the data may not be displayed either.<br /><br />
<div id="art_pre_template"><b> </b><br /><b>Environment : </b><br />Microsoft* Windows*<br /><br /><b>Root Cause : </b><br />There is a compatibility issue with the current releases of Intel® Parallel Amplifier 2011 and VTune Amplifier XE.  Both tools use different versions of a DLL and the version used by Parallel Amplifier is not compatible with VTune Amplifier XE.  If that version is loaded by Visual Studio, VTune Amplifier XE cannot load the newer version of the DLL, which is required for correct operation.<br /><br /><b>Resolution : </b><br />Until a fix is available, closing down Visual Studio*, restarting it, and using VTune Amplifier XE prior to any other activity should resolve the issue for the duration of the Visual Studio invocation.  The problem arises when Parallel Amplifier functionality is used prior to VTune Amplifier XE in the same Visual Studio session.<br /><br /></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/attach-debugger-message/</link>
      <pubDate>Sun, 21 Nov 2010 21:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/attach-debugger-message/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/attach-debugger-message/</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® VTune™ Amplifier XE Knowledge Base</category>
    </item>
    <item>
      <title>Intel® Parallel Studio – supported versions</title>
      <description><![CDATA[ <p><b>Intel® Parallel Studio - supported versions</b></p>
<p>Interactive support for Intel® Parallel Studio is provided via <a href="http://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-premier-support">Intel® Premier Support</a> or the <a href="http://software.intel.com/en-us/forums/intel-parallel-studio/">Intel® Parallel Studio User Forum</a>.  Older versions are supported only via the Intel® Parallel Studio User Forum.</p>
<p>If you have any questions about the above policy, please contact Intel® Premier Support or post on the Intel Parallel Studio User Forum.  Refer to the <a href="http://software.intel.com/en-us/articles/performance-tools-for-software-developers-intel-premier-support/">Intel® Premier Support frequently asked questions</a> for useful information.</p>
<p>The following Studio versions are currently supported:</p>
<p>• <strong>Current release: <br /></strong>   -  Intel® Parallel Studio 2011 for Windows* Service Pack 1 (SP1) <br /><br />• Previous releases:<br />   -  Intel® Parallel Studio 2011 for Windows*   (Initial Release and above)<br /> </p> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-parallel-studio-supportedversions-1/</link>
      <pubDate>Wed, 22 Sep 2010 00:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-parallel-studio-supportedversions-1/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-parallel-studio-supportedversions-1/</guid>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
      <category>Intel® Parallel Advisor Knowledge Base</category>
    </item>
    <item>
      <title>Mismatched Call Stacks between Bottom-up tree and Call Stack pane</title>
      <description><![CDATA[ <div id="art_pre_template">
<p>When you run a stack-sampling analysis such as Hotspots, Concurrency, or Locks and Waits (OS timer-based collection only), call stack information is displayed in the Bottom-up report.</p>
<p>The call stack tree lists all call sequences (stacks) that led to hot functions. Call stacks from different threads are aggregated for the view type "Function - Call Stack", or displayed separately for each thread for the view type "Thread - Function - Call Stack".</p>
<p>Here is the Hotspots result for a single-threaded application. The hot function foo_data_collected() has only one stack, wmain(), in the Bottom-up tree, but the Call Stack pane indicates that there are two stacks.</p>
<p><br /><img height="274" width="673" src="http://software.intel.com/file/30425" alt="callstack1.bmp" title="callstack1.bmp" /><br /><br />The first stack is wmain() at "test_itt_api.cpp:32". Now look at the second stack:</p>
<p><br /><img height="266" width="673" src="http://software.intel.com/file/30426" alt="callstack2.bmp" title="callstack2.bmp" /><br /><br />The second stack is wmain() at "test_itt_api.cpp:24". That means wmain() calls the hot function foo_data_collected() from two different source lines in the same function. The Bottom-up pane aggregates these stacks into one line, because the caller function is the same, while the Call Stack pane shows each call site as a separate stack.</p>
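<p>The situation can be reproduced with a minimal sketch like the one below. It is not the article's test_itt_api.cpp: the body of foo_data_collected() is invented purely to consume CPU time, and driver() is a hypothetical stand-in for wmain(). It also assumes a build in which the hot function is not inlined, since inlining would remove the separate call sites.</p>

```cpp
// Hypothetical stand-in for the article's hot function; the loop exists only
// to burn CPU time so that a sampling profiler attributes samples to it.
double foo_data_collected(int n) {
    double sum = 0.0;
    for (int i = n; i > 0; --i) {
        sum += 1.0 / i;
    }
    return sum;
}

// Hypothetical stand-in for wmain(): a single caller with two call sites.
// A view keyed by caller function would show one entry for this function,
// while a view keyed by call site (source line) would show two.
double driver() {
    double first = foo_data_collected(1000000);   // call site A
    double second = foo_data_collected(2000000);  // call site B
    return first + second;
}
```

<p>Calling driver() from the program entry point and profiling with Hotspots should give one caller for the hot function but two source lines, which is the pattern described above.</p>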
</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/mismatched-call-stacks-between-bottom-up-tree-and-call-stack-pane/</link>
      <pubDate>Thu, 16 Sep 2010 05:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/mismatched-call-stacks-between-bottom-up-tree-and-call-stack-pane/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/mismatched-call-stacks-between-bottom-up-tree-and-call-stack-pane/</guid>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® VTune™ Amplifier XE Knowledge Base</category>
    </item>
    <item>
      <title>How to change the Parallel Studio version integrated into Visual Studio</title>
      <description><![CDATA[ <div id="art_pre_template">
<p><b>Problem : </b><br />Only one version of Intel® Parallel Studio can be integrated with any one version of Microsoft Visual Studio* at a time. Therefore, if you have Intel Parallel Studio installed on your system and then install a different version alongside it, the newly installed version is integrated into Visual Studio in place of the previously installed version. This means you will see the newly installed Parallel Studio toolbars, menu items, etc. </p>
<p>You can control which version of Intel Parallel Studio you use with a particular Visual Studio by performing the steps outlined below.<br /><br /><b>Environment: </b><br />Windows systems with Microsoft Visual Studio 2005, 2008, and/or 2010 installed along with multiple versions of Intel Parallel Studio.<br /><br /><b>Root Cause: </b><br />Limit of one Parallel Studio integrated with a version of Visual Studio.<br /><br /><b>Resolution: </b><br />You will need to change the version of Parallel Studio that is integrated with a particular version of Visual Studio.  This will need to be done for each component of the Parallel Studio that you have installed. </p>
<ol type="1">
<li>Begin by removing the integration from the version that is currently integrated. </li>
</ol>
<p>For  <i>Intel Parallel Amplifier</i> or <i>Intel Parallel Inspector</i>, start by opening the Command Prompt window for the version of Parallel Studio you wish to disable. For example:</p>
<table cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="498" valign="top">
<p>To open an Intel Parallel Studio 2011 command prompt in the Visual Studio 2005 mode:</p>
<p><b>Start &gt; All Programs &gt; Intel Parallel Studio 2011 &gt; Command Prompt &gt; IA 32 Visual Studio 2005 mode</b>. </p>
<p>To open an Intel Parallel Studio command prompt in the Visual Studio 2008 mode:  </p>
<p><b>Start &gt; All Programs &gt; Intel Parallel Studio &gt; Command Prompt &gt; IA 32 Visual Studio 2008 mode</b>. </p>
</td>
</tr>
</tbody>
</table>
<p> </p>
<p>Now invoke the appropriate script to disable the integration:</p>
<table width="559" cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="114" valign="top">
<p><b>Tool</b></p>
</td>
<td width="148" valign="top">
<p align="center"><b>Visual Studio 2005</b></p>
</td>
<td width="148" valign="top">
<p align="center"><b>Visual Studio 2008</b></p>
</td>
<td width="148" valign="top">
<p align="center"><b>Visual Studio 2010</b></p>
</td>
</tr>
<tr>
<td width="114" valign="top">
<p>Intel Parallel Amplifier</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg disable vs2005</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg disable vs2008</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg disable vs2010</p>
</td>
</tr>
<tr>
<td width="114" valign="top">
<p>Intel Parallel Amplifier 2011</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg --disable 2005</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg --disable 2008</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg --disable 2010</p>
</td>
</tr>
<tr>
<td width="114" valign="top">
<p>Intel Parallel Inspector</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg disable vs2005</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg disable vs2008</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg disable vs2010</p>
</td>
</tr>
<tr>
<td width="114" valign="top">
<p>Intel Parallel Inspector 2011</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg --disable 2005</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg --disable 2008</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg --disable 2010</p>
</td>
</tr>
</tbody>
</table>
<p><br /><br />For <i>Intel Parallel Composer</i> use the<b> </b><b>Control Panel &gt; Add/Remove Programs</b> for the version you want to disable:</p>
<p><br />Select <b>Modify</b> and disable the following options:</p>
<p>○ Integrated Documentation<br />○ Intel Parallel Debugger Extension<br />○ Integration(s) in Microsoft Visual Studio</p>
<p>            Select <b>Next &gt; Modify<br /></b><b><br /><br /></b></p>
<ol start="2" type="1">
<li>Enable the Visual Studio integration.   </li>
</ol>
<p>For  <i>Intel Parallel Amplifier</i> or <i>Intel Parallel Inspector</i>, start by opening the Command Prompt window for the version of Parallel Studio you wish to enable, then invoke the appropriate script to enable the integration:</p>
<p> </p>
<table width="565" cellpadding="0" cellspacing="0" border="1">
<tbody>
<tr>
<td width="138" valign="top">
<p><b>Tool</b></p>
</td>
<td width="130" valign="top">
<p align="center"><b>Visual Studio 2005</b></p>
</td>
<td width="148" valign="top">
<p align="center"><b>Visual Studio 2008</b></p>
</td>
<td width="148" valign="top">
<p align="center"><b>Visual Studio 2010</b></p>
</td>
</tr>
<tr>
<td width="138" valign="top">
<p>Intel Parallel Amplifier</p>
</td>
<td width="130" valign="top">
<p>ampl-vsreg integrate vs2005</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg integrate vs2008</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg integrate vs2010</p>
</td>
</tr>
<tr>
<td width="138" valign="top">
<p>Intel Parallel Amplifier 2011</p>
</td>
<td width="130" valign="top">
<p>ampl-vsreg --integrate 2005</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg --integrate 2008</p>
</td>
<td width="148" valign="top">
<p>ampl-vsreg --integrate 2010</p>
</td>
</tr>
<tr>
<td width="138" valign="top">
<p>Intel Parallel Inspector</p>
</td>
<td width="130" valign="top">
<p>insp-vsreg integrate vs2005</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg integrate vs2008</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg integrate vs2010</p>
</td>
</tr>
<tr>
<td width="138" valign="top">
<p>Intel Parallel Inspector 2011</p>
</td>
<td width="130" valign="top">
<p>insp-vsreg --integrate 2005</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg --integrate 2008</p>
</td>
<td width="148" valign="top">
<p>insp-vsreg --integrate 2010</p>
</td>
</tr>
</tbody>
</table>
<p> </p>
<p><br />For <i>Intel Parallel Composer</i> use the <b>Control Panel &gt; Add/Remove Programs</b> entry for the version you want to enable:</p>
<p><br />Select <b>Modify</b> and enable the following options:</p>
<p>○ Integrated Documentation<br />○ Intel Parallel Debugger Extension<br />○ Integration(s) in Microsoft Visual Studio</p>
Select the Visual Studio versions you would like to enable integration with.<br />Select <b>Next &gt; Modify</b></div> ]]></description>
      <link>http://software.intel.com/en-us/articles/how-to-change-the-parallel-studio-version-integrated-into-visual-studio/</link>
      <pubDate>Thu, 02 Sep 2010 19:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/how-to-change-the-parallel-studio-version-integrated-into-visual-studio/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/how-to-change-the-parallel-studio-version-integrated-into-visual-studio/</guid>
      <category>Intel® Parallel Composer</category>
      <category>Intel® Parallel Amplifier</category>
      <category>Intel® Parallel Inspector</category>
      <category>Intel® Software Development Products Home</category>
      <category>Intel® Parallel Studio Home</category>
      <category>Intel® Parallel Advisor</category>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
      <category>Intel® Software Development Products Registration Center Knowledge Base</category>
      <category>Intel® Parallel Advisor Knowledge Base</category>
    </item>
    <item>
      <title>Installation Error &amp;#34;HelpLibAgent.exe has stopped working&amp;#34; When Installing or Uninstalling Intel Parallel Studio 2011</title>
      <description><![CDATA[ <p>When installing or uninstalling Intel(R) Parallel Studio 2011 on a system with Microsoft* Visual Studio 2010*, you may see the error message "HelpLibAgent.exe has stopped working":<br /><br /><img src="http://software.intel.com/file/29668" alt="Hlib_Inst_Err.png" title="Hlib_Inst_Err.png" /></p>
<p>Select "<strong>Close the program</strong>" to continue. <br /><br />This error does not prevent the installation or uninstallation of Intel Parallel Studio 2011. It is caused by a third-party tool and does no harm to your system.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/installation-error-helplibagentexe-has-stopped-working-when-uninstalling-intel-parallel-studio-2011/</link>
      <pubDate>Wed, 01 Sep 2010 21:00:00 -0700</pubDate>
      <comments>http://software.intel.com/en-us/articles/installation-error-helplibagentexe-has-stopped-working-when-uninstalling-intel-parallel-studio-2011/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/installation-error-helplibagentexe-has-stopped-working-when-uninstalling-intel-parallel-studio-2011/</guid>
      <category>Intel® Parallel Studio Home</category>
      <category>Intel® Parallel Amplifier Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
      <category>Intel® Parallel Inspector Knowledge Base</category>
      <category>Intel® Parallel Advisor Knowledge Base</category>
    </item>
  </channel></rss>
