<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated on Sun, 12 Feb 2012 03:30:33 -0800 -->
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <atom:link href="http://software.intel.com/en-us/articles/intel-compilers/type/technical-notes/feed/" rel="self" type="application/rss+xml" />
    <title>Intel Software Network articles Feed</title>
    <link>http://software.intel.com/en-us/articles/intel-compilers/type/technical-notes/</link>
    <description></description>
    <language>en-us</language>
    <item>
      <title>Intel® Parallel Amplifier 2011</title>
      <description><![CDATA[

<div id="wrap">
<!--Top Header -->
<div id="product_component_header">
<table>
<tbody>
<tr>
<td><img height="107" width="13" src="http://software.intel.com/sites/products/web2010/images/transparent.gif" /></td>
<td><a name="pageheader" id="pageheader"></a><span ><b>Intel® Parallel Amplifier 2011 for Windows* - Evaluation</b></span><br /><br /></td>
</tr>
</tbody>
</table>
</div>
<!--End Top Header --><!-- Content Container -->
<div id="tabbox">
<div align="top" class="notab-box-shadow" id="contentwell">
<div >
<table>
<tbody>
<tr>
<td>
<p>To install and use the Intel® Parallel Amplifier 2011 for Windows* product, you must have already installed a supported Microsoft development product. Please see the <a target="blank" href="http://software.intel.com/en-us/articles/intel-c-composer-xe-2011-release-notes/">System Requirements</a> for more information. If you do not already have a supported Microsoft development product installed, you may download a <a target="blank" href="http://www.microsoft.com/downloads/en/details.aspx?FamilyID=26bae65f-b0df-4081-ae6e-1d828993d4d0&amp;displaylang=en ">free 90-day trial version of Microsoft Visual Studio 2010*</a> from Microsoft. This will allow full functionality of the product during your evaluation. If, at the end of the evaluation, you choose to purchase Intel® Parallel Amplifier 2011 for Windows*, you must also purchase a license for a supported Microsoft development product if you do not already have one.</p>
<p>If you understand and accept this requirement, click on the Accept button to proceed with your evaluation.</p>
<table align="center" width="300" cellpadding="0" cellspacing="0" border="0" class="sectionBodyText">
<tbody>
<tr>
<td><a href="https://registrationcenter.intel.com/RegCenter/EvalForm.aspx?productid=1467"><img src="http://software.intel.com/file/27597" /></a></td>
<td width="10"></td>
<td><a href="http://software.intel.com/en-us/articles/intel-software-evaluation-center/"><img src="http://software.intel.com/file/27598" /> </a></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-parallel-amplifier-2011/</link>
      <pubDate>Thu, 09 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-parallel-amplifier-2011/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-parallel-amplifier-2011/</guid>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>No Selection of "Use Intel C++" in Visual Studio 2010</title>
      <description><![CDATA[ <br />
<div id="art_pre_template"><b>Problem: </b><br />The "Use Intel C++" option is missing from the "Intel C++ Composer XE 2011" project context menu, even though the installation was successful. <br /><br /><strong>Environment: </strong>Any Windows x64 OS<br /><br /><b>Root Cause: </b><br />Many people have been confused by the smaller package "Product for 64-bit(x64) development", whose download file name is "w_ccompxe_intel64_2011.8.278.exe". This 64-bit package contains only the "Intel(R) C++ Compiler XE for applications running on Intel-64" and the libraries. <br /><br />A Windows x64 system, however, can be used to develop either IA-32 or Intel-64 applications. <br /><br />When you create a new project with Visual Studio 2005, 2008 or 2010, the default configuration is "win32". To build that configuration with the Intel C++ Compiler, you need the "Intel C++ Compiler XE for applications running on IA-32", which is not included in the package "Product for 64-bit(x64) development". <br /><br /><b>Resolution: </b><br /><br />Download and install one of the following packages: <br />1. the 32-bit package "Product for 32-bit development"<br />2. the full product package "Product for 32-bit/64-bit (x64) development"</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/no-selection-of-use-intel-c-in-visual-studio-2010/</link>
      <pubDate>Thu, 09 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/no-selection-of-use-intel-c-in-visual-studio-2010/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/no-selection-of-use-intel-c-in-visual-studio-2010/</guid>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
    </item>
    <item>
      <title>Improving the Compute Performance of Video Processing Software Using AVX (Advanced Vector Extensions) Instructions (by Eli Hernandez and Larry Moore)</title>
      <description><![CDATA[ <h2 class="sectionHeading">Download Article</h2>
Download <a href="http://software.intel.com/file/41302">Improving the Compute Performance of Video Processing Software Using AVX (Advanced Vector Extensions) Instructions</a> [PDF 311KB]<br /> <br />
<h2 class="sectionHeading">Abstract</h2>
Modern x86 CPUs permit instruction level parallelism (e.g. SIMD) on vector registers at most 128 bits wide. Second Generation Intel® Core™ Processors include the first generation of AVX (256-bit operators), which permits increased parallel processing. This paper outlines a case study in which AVX instructions are used to improve the compute performance of a de-saturation algorithm. The paper also discusses how future integer based AVX instructions might be used to further enhance SIMD optimizations and achieve even greater performance benefits on video processing algorithms.<br /> <br />
<h2 class="sectionHeading">1. Introduction</h2>
Modern x86 CPUs permit Instruction Level Parallelism (ILP), such as Single Instruction Multiple Data (SIMD), on vector registers at most 128 bits wide. These vector registers can be used to process multiple data elements with fewer instructions. Second Generation Intel® Core™ Processors (codenamed Sandy Bridge) included the first generation of AVX, which is a 256-bit instruction set extension to the Intel® Streaming SIMD Extensions (Intel® SSE).<br /> <br /> The first generation of AVX included a wide range of instructions designed primarily to accelerate compute intensive algorithms performing arithmetic operations on floating point data. However, even if an algorithm is integer based, using AVX instructions could potentially increase its performance without sacrificing accuracy of the results. In video processing algorithms, the pixel channels are often stored as 8-bit unsigned integers (bytes) and processed as 32-bit or larger integer values. Therefore, most video algorithms require conversion of pixels to and from a wider format. Wider bit widths are used for calculation accuracy and smaller formats are used to save space. Typically, floating-point arithmetic is not used because the accuracy gained does not justify the extra conversion cost. However, AVX is capable of greatly improving the runtime performance of video processing software and a vast number of other software applications through increased parallelism.<br /> <br /> This paper describes a case study in which AVX instructions are used to enhance the performance of a de-saturation algorithm (a common video filter). The case study takes the algorithm from a non-SIMD state to AVX based SIMD. The paper also discusses how future generations of AVX may be able to further aid performance optimization and enable greater performance of video processing.<br /> <br />
<h2 class="sectionHeading">2. Intel SIMD Overview</h2>
On Intel SIMD architectures, a vector register can store a group of data elements of a single data type (e.g. floats or integers). The vector registers of Sandy Bridge are 256 bits wide, whereas those of earlier processors since the Intel® Pentium III were 128 bits wide. Each vector register (called YMM in Sandy Bridge) can store 8 floats, 8 32-bit integers, 32 chars, etc. AVX instructions operate on the full 256 bits, but SSE can only operate on 128 bits.<br /> <br /> A SIMD-enabled processor can execute a single operation on multiple data elements. An operation performed simultaneously on multiple data elements is a vector process. SIMD vectorization is the process of converting an algorithm from a scalar to a vector implementation. The multiply function in the sample code below illustrates the difference between the scalar and SIMD vector processes.<br /> <br /> <img src="http://software.intel.com/file/41320" /><br /> <br />
<p ><img src="http://software.intel.com/file/41321" /></p>
<div ><b>Figure 1:</b><i> This illustrates the difference between scalar and vector processes. The scalar version would have 16 loads, 8 multiplications and 8 stores. SSE can potentially have 4 loads, 2 vector multiplications and 2 stores. AVX would use 2 loads, 1 large vector multiplication and 1 store. The VMUL labels were shortened to hide the distinction between the various versions of vector multiplication instructions. VMUL multiplies vectors A and B element pair by element pair and stores the results in another vector. Suppose, for simplicity, that each load or store costs 3 cycles, each multiplication costs 1 cycle, and pipelining is ignored. Then the scalar version spends 80 cycles to compute 8 elements while the AVX version spends 10 cycles, yielding a theoretical speedup of 8x. This illustrates why SIMD vectorization has become a very important part of optimizing application performance. Given the performance benefits observed with SIMD, automatic SIMD vectorization has also become a keystone feature of advanced compilers.<br /></i></div>
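<br /> The cycle counts quoted in the Figure 1 caption can be checked with a few lines of arithmetic. The Python sketch below is illustrative only: it applies the caption's simplified cost model (3 cycles per load or store, 1 cycle per multiplication, no pipelining), and the names cycles, LOAD_STORE_CYCLES and MUL_CYCLES are made up for this example.<br /> <br />

```python
LOAD_STORE_CYCLES = 3  # assumed cost of one load or one store (from the caption)
MUL_CYCLES = 1         # assumed cost of one multiplication (from the caption)

def cycles(loads, muls, stores):
    """Total cycles under the simplified model, ignoring pipelining."""
    return (loads + stores) * LOAD_STORE_CYCLES + muls * MUL_CYCLES

scalar = cycles(loads=16, muls=8, stores=8)  # one element pair at a time
sse = cycles(loads=4, muls=2, stores=2)      # 4 floats per 128-bit register
avx = cycles(loads=2, muls=1, stores=1)      # 8 floats per 256-bit register

print(scalar, sse, avx)   # prints 80 20 10
print(scalar / avx)       # prints 8.0
```

Under the same assumptions the SSE version lands halfway, at 20 cycles, which is a 4x theoretical speedup over the scalar version.<br />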
<br />
<h2 class="sectionHeading">3. Video Processing Code</h2>
Typical video processing algorithms calculate pixel values using a triple for-loop (for each frame, for each X, for each Y). This loop nest typically shows up as an area of high CPU utilization (i.e. a hotspot). Video processing application hotspots are excellent candidates for optimization with AVX.<br /> <br /> A simple approach to optimizing with SIMD is to take advantage of the latest processor technology features, such as AVX. The following sections describe the optimization process using AVX instructions to enhance the performance of a de-saturation algorithm. The serial code implementation is briefly discussed and AVX-based SIMD instructions are used to optimize the de-saturation algorithm. Finally, this chapter ends with the performance results of the optimized code.<br /><br /> <br />
<h2 class="sectionHeading">3.1. Desaturation - Sample Code</h2>
The typical implementation of the Desaturation algorithm uses the incoming pixel values to compute a luminance value. The luminance value is applied to all outgoing pixels to de-saturate images as part of processing video for output.<br /> <br /> As you can see in the sample code below, the algorithm traverses the image row by row to get pixel data, whose channel values (blue, green, and red) are used to calculate the luminance value. In order to achieve high accuracy, the algorithm converts the one-byte channel values to single precision floating point. The floating-point values are used in a dot product type of operation to compute the luminance value. The Desaturation sample algorithm uses the fLuminance(…) function to convert pixel channel values from byte to float. The conversion is achieved implicitly by typecasting each channel value to float. With constant weights for Red, Green and Blue, the fLuminance(…) function uses the float values to compute the luminance, which is then applied to the video output.<br /> <br /> <img src="http://software.intel.com/file/41322" /><br /> <img src="http://software.intel.com/file/41323" /><br /> <br /> Although the scalar code looks simple and trivial, the assembly code generated by the compiler is much more complex. Analysis of the generated assembly code shows that the implicit byte to float conversion can be performed with fewer instructions by using the more efficient AVX instructions. As we have observed, the serial code calculates one channel and one pixel at a time. Nothing is computed in parallel (ignoring pipelining and reordering).  Refer to <b>Appendix A</b> for the assembly code.<br /><br /> <br />
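The scalar procedure described above can be sketched in a few lines. This Python version is a minimal illustration, not the article's sample code: the weights are assumed Rec. 601 style constants (the article's own constants appear only in its code image and may differ), rounding to nearest is an assumption, and the names luminance and desaturate are hypothetical.<br /> <br />

```python
# Hypothetical luminance weights (Rec. 601 style); the article's own constants
# are shown only in its code image and may differ.
WEIGHT_R, WEIGHT_G, WEIGHT_B = 0.299, 0.587, 0.114

def luminance(blue, green, red):
    """Convert three 8-bit channel values to float and return the weighted sum."""
    return float(red) * WEIGHT_R + float(green) * WEIGHT_G + float(blue) * WEIGHT_B

def desaturate(frame):
    """Desaturate a frame given as rows of (blue, green, red) byte tuples."""
    out = []
    for row in frame:                    # traverse the image row by row
        out_row = []
        for blue, green, red in row:     # one pixel per iteration, nothing in parallel
            # Round to nearest (an assumption) and clamp back into one byte.
            y = min(255, int(luminance(blue, green, red) + 0.5))
            out_row.append((y, y, y))    # the same luminance goes to every channel
        out.append(out_row)
    return out

frame = [[(0, 0, 0), (255, 255, 255)], [(255, 0, 0), (10, 20, 30)]]
print(desaturate(frame))
```

Each pixel costs three byte to float conversions, three multiplications and two additions, all executed one after another, which is the work the SIMD version spreads across vector lanes.<br />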
<h2 class="sectionHeading">3.2. Desaturation - Optimization with AVX</h2>
This section outlines the transformation of the serial code and describes how AVX, SSE4.1 and SSE2 instructions optimize the de-saturation algorithm. As illustrated in Chapter 2, SIMD lets us work on many items at once. Therefore, the load, store, conversion and math operations can be done in parallel. The algorithm below describes how we can use instruction level parallelism (via AVX instructions) to significantly improve performance. Note that the algorithm is restricted to instructions that are currently available, rather than the idealized future instructions discussed later. Therefore, lines 19, 20 and 21 involve an intermediate step to convert 32-bit integers back down to 8-bit unsigned integers.<br /> <br /> <img src="http://software.intel.com/file/41332" /><br /> <img src="http://software.intel.com/file/41333" /><br /> <br />
<div ><b>Figure 2. </b><i>De-saturation algorithm</i><br /><br /></div>
<br /> With Figure 2 as the backbone of de-saturate, we can implement the real code. The motivations for using a procedure similar to Figures 2 and 3 are that:<br /> <br /> 
<ul>
<li>AVX provides greater throughput for parallel processing of single-precision floating-point values than any past Intel SIMD x86 extension (MMX, SSE, SSE2, SSE3, SSE4.1, SSE4.2).</li>
<li>The cost (SIMD) to cast byte (8-bit unsigned char) to integer (32-bit signed integer) to single precision floating point (32-bit float) and back is less than using multiple calls of the equivalent code (scalar) using just bytes or integers.</li>
<li>Using byte based SIMD with this procedure gives poor precision. Parallel performance is not considered.</li>
<li>Using integer based SIMD with this procedure gives acceptable precision. However, AVX currently has no integer arithmetic instructions, so integer code cannot take full advantage of the 256-bit registers.</li>
<li>Using float based SIMD with this procedure gives very good precision and offers higher performance than those described above.</li>
</ul>
<img src="http://software.intel.com/file/41334" /><br /> <img src="http://software.intel.com/file/41335" /><br /> <br /> <b>Figure 3. </b><i>De-saturation code optimized with AVX<sup>1</sup></i><br /> <br /> The algorithm and AVX code shown in Figures 2 and 3 convey exactly the same process line for line. Notice that only lines 9 and 16 do the real work. Each of these lines processes 8 single precision floating point multiplications in parallel, for a total of 16 multiplications in 2 instructions versus 16 individual multiplication instructions. Everything else is overhead needed to make use of the parallel instructions or to increase precision.<br /> <br /> Despite the overhead, this code still improves performance by 1.45x. If integer based instructions existed with parallelism equivalent to that of single precision floating point, we could further increase performance. In that case, lines 6, 8, 10, 11, 14, 15, 17 and 18 could be eliminated. Lines 9 and 16 would operate on integers instead. Lines 19, 20 and 21 could be replaced by a single pack instruction (integer to byte). Of course, there are other hypothetical instructions that could be introduced with future AVX generations. The potential performance gain is left as an exercise for the reader. For the assembly instructions generated by the intrinsic functions used in the inner [ix] loop, refer to <b>Appendix B</b>.<br /><br /> <br />
<h2 class="sectionHeading">3.3. Desaturation - Performance Test Results<b><sup>3</sup></b></h2>
In this study, the de-saturation algorithm optimized with AVX showed a 1.45x speedup over the serial code. To gather performance data, the de-saturation algorithm was applied to a 1440x1080 image in a loop of 100 iterations. Performance was measured as the elapsed time (in milliseconds) taken to de-saturate the image. The following performance numbers were consistently observed:<br /> <br /> 
<table border="0" cellpadding="10" cellspacing="0">
<tbody>
<tr>
<td>Serial Code:</td>
<td>1264 milliseconds</td>
</tr>
<tr>
<td>Code with AVX:</td>
<td>873 milliseconds</td>
</tr>
<tr>
<td>Performance Scaling:</td>
<td>1.45x (1264 ms / 873 ms)</td>
</tr>
</tbody>
</table>
<br /> A kernel (small application program) was used to run the algorithm. A kernel with a 1.45x scaling typically translates to a performance improvement of 10% to 15% when measured at the workload level. However, for video processing this rule of thumb does not apply. Suppose you are applying the de-saturation algorithm to a video clip of one minute or longer. In that case, there will be more than 100 frames (images) to process. In theory, since more data has to be processed, the potential performance boost could be more than 1.45x, especially when processing full High-Definition (i.e., 1920x1080) video.<br /><br /> <br />
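The scaling figure in the table, and the rule of thumb that a 1.45x kernel speedup yields roughly 10% to 15% at the workload level, can be related through Amdahl's law. The Python sketch below is an illustration, not part of the original measurements: the fraction values are hypothetical shares of total runtime spent in the kernel, chosen only to show which shares would produce gains in that range.<br /> <br />

```python
def speedup(before_ms, after_ms):
    """Ratio of elapsed times; values above 1.0 mean the new code is faster."""
    return before_ms / after_ms

def workload_speedup(kernel_fraction, kernel_speedup):
    """Amdahl's law: only kernel_fraction of the total runtime gets faster."""
    return 1.0 / ((1.0 - kernel_fraction) + kernel_fraction / kernel_speedup)

kernel = speedup(1264, 873)      # serial vs AVX elapsed times from the table
print(round(kernel, 2))          # prints 1.45

# Hypothetical shares of total workload time spent in the de-saturation kernel.
for fraction in (0.30, 0.40, 1.00):
    print(fraction, round(workload_speedup(fraction, kernel), 2))
```

With these assumed shares, a kernel that accounts for 30% to 40% of the runtime gives an overall gain of about 1.10x to 1.14x, while a workload dominated by the kernel approaches the full 1.45x.<br />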
<h2 class="sectionHeading">4. Packed Integer Conversion Instructions</h2>
Since our optimized de-saturation algorithm uses one of the SSE4.1 instructions, we give an overview of the relevant part of SSE4.1 here, because other SSE4.1 instructions may be applicable to the optimization of other video processing algorithms. The Packed Integer Conversion instruction set contains 12 instructions for packed integer bit width conversions, any of which can be used to optimize code where the bit width of integer data is to be increased.<br /> <br /> The table in <b>Figure 4</b> lists the SSE4.1 instructions for packed integer conversions. The instructions support sign extension and zero extension conversions of byte to word, byte to double word, byte to quad-word, word to double word, word to quad-word, and double word to quad-word. Additionally, the chart compares the SSE2 and SSE4.1 instructions needed to convert four (4) one-byte integers to four (4) 32-bit integers.<br /> <br /> The <i>pmovzxbd (byte to double word)</i> instruction was used a total of four (4) times in the de-saturate optimization. If these instructions gain support for the full 256-bit register, the number of uses in the optimized algorithm will drop to two (2), further improving the loop performance.<br /> <br /> <img src="http://software.intel.com/file/41328" /><br /> <b>Figure 4. </b><i>Instructions for bit width conversions of packed integers</i><br /> <br /> The source operand of a packed integer conversion instruction is either an XMM register or memory. The destination is always an XMM register. When accessing memory, no alignment is required unless alignment checking is enabled, in which case all conversions must be aligned to the width of the memory being referenced. The number of elements that can be converted and the width of the memory reference are illustrated in <b>Figure 5</b>. The alignment requirement is shown in parentheses.<br /> <br /> <img src="http://software.intel.com/file/41329" /><br /> <b>Figure 5. Number of elements to process.</b> <i>P is Packed. MOV is Move (copy register). ZX is Zero Extend. SX is Sign Extend. B is Byte. W is Word. D is Double Word. Q is Quad-Word.</i><br /><br /> <br />
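To make the zero extend and sign extend variants concrete, the sketch below models in plain Python what pmovzxbd and pmovsxbd do to the low four bytes of their source, and how many elements a conversion produces for a given register width. It is a behavioral model only: the function names are invented for this example and are not intrinsics, and the 256-bit case is the hypothetical future form mentioned above.<br /> <br />

```python
WIDTH = {"b": 8, "w": 16, "d": 32, "q": 64}  # byte, word, double word, quad-word

def elements_converted(dst, register_bits=128):
    """Number of widened elements that fit in the destination register."""
    return register_bits // WIDTH[dst]

def source_bits(src, dst, register_bits=128):
    """Width of the source data consumed by one conversion instruction."""
    return elements_converted(dst, register_bits) * WIDTH[src]

def zero_extend_bd(data):
    """Model of pmovzxbd: the low 4 bytes become 4 unsigned 32-bit values."""
    return list(data[:4])

def sign_extend_bd(data):
    """Model of pmovsxbd: the low 4 bytes become 4 signed 32-bit values."""
    return [b - 256 if b > 127 else b for b in data[:4]]

channels = bytes([0, 127, 128, 255, 9, 9, 9, 9])
print(zero_extend_bd(channels))                        # prints [0, 127, 128, 255]
print(sign_extend_bd(channels))                        # prints [0, 127, -128, -1]
print(elements_converted("d"), source_bits("b", "d"))  # prints 4 32

# Widening 16 channel bytes: 128-bit registers vs a hypothetical 256-bit form.
print(16 // elements_converted("d"), 16 // elements_converted("d", 256))  # prints 4 2
```

Pixel channels are unsigned bytes, so the zero extend form is the one that preserves values such as 128 and 255; the sign extend form would turn them negative.<br />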
<h2 class="sectionHeading">5. Conclusion</h2>
This paper has discussed how Second Generation Intel® Core™ Processors could increase parallel processing via AVX instructions and 256-bit registers. This paper outlined a case study in which AVX instructions were used to improve the compute performance of a de-saturation algorithm. The paper also discussed how future integer based AVX instructions could be used to further enhance SIMD optimizations and achieve even greater performance benefits on video processing algorithms. The procedure described demonstrated how AVX instructions or their intrinsic functions could be used to improve the runtime performance of video processing applications. The paper documented that, despite some overhead incurred to set up SIMD processing, the video de-saturation still achieved a 1.45x speedup. <br /><br /> <br />
<h2 class="sectionHeading">About the Authors</h2>
Eli Hernandez is an Application Engineer in the Consumer Client and Power Enabling Group at Intel Corporation where he works with customers to optimize their software for power efficiency and to run best on Intel hardware and software technologies. Eli joined Intel in August of 2007 with over 12 years of experience in software development for the telecom and the chemical industry. He received his B.S. in Electrical Engineering in 1989 and completed Master Studies in Computer Science in 1991-1992 from the DePaul University of Chicago.<br /><br /> In 2008, Larry Moore graduated from Saint Petersburg College with Honors. He received a Who's Who Among Students Award and was a member of Phi Theta Kappa Honor Society. In 2011, he spent 8 months at Intel as an application engineer intern, in DuPont, Washington. Currently, he is attending the University of South Florida at Tampa, Florida in an accelerated graduate program, pursuing both a Bachelor of Science and Master of Science in Computer Engineering. His current research involves computer aided verification of real-time systems and model checking. Larry is also a member of IEEE Computer Society. <br /><br /> <br />
<h2 class="sectionHeading">Appendix A:  Inner loop equivalent assembly of the serial code</h2>
Roughly 45 instructions are needed to process one iteration of the algorithm inner loop, with a throughput of 1 pixel processed per iteration.<br /> <br /> <img src="http://software.intel.com/file/41330" /><br /> <br />
<h2 class="sectionHeading">Appendix B: Equivalent assembly of inner loop optimized with AVX</h2>
Roughly 30 instructions are needed to process one iteration of the algorithm inner loop, with a throughput of 4 pixels processed per iteration.<br /> <br /> <img src="http://software.intel.com/file/41331" /><br /> <br /> <sup>1</sup> Load and store operations in the optimized code assume the data is aligned.<br /> <sup>2</sup> Please see footnote 3 and section 3.3.<br /> <sup>3</sup> The performance measurements in this section are actual numbers from real tests. However, we do not guarantee that you will achieve the same performance.<br /> <br />
<div id="vc-meta" >
<div id="vc-meta-author">
<div></div>
</div>
<div id="vc-meta-pubdate">02-08-2012</div>
<div id="vc-meta-modificationdate">02-08-2012</div>
<div id="vc-meta-taxonomy">Tech Articles</div>
<div id="vc-meta-category-product">
<div></div>
</div>
<div id="vc-meta-category">
<div></div>
</div>
<div id="vc-meta-thumb">http://software.intel.com/file/41303</div>
</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/improving-the-compute-performance-of-video-processing-software-using-avx-advanced-vector-extensions-instructions/</link>
      <pubDate>Wed, 08 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/improving-the-compute-performance-of-video-processing-software-using-avx-advanced-vector-extensions-instructions/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/improving-the-compute-performance-of-video-processing-software-using-avx-advanced-vector-extensions-instructions/</guid>
      <category>Visual Computing</category>
      <category>Visual Computing Source</category>
    </item>
    <item>
      <title>Using Intel Cluster Checker to check that MPI applications will properly run over Infiniband</title>
      <description><![CDATA[ <p class="MsoNormal">One of the benefits of Intel Cluster Checker is that it acts as an application proxy. If the tool passes, there is a high probability that an MPI application will run properly.</p>
<p class="MsoNormal">To ensure this, the Intel Cluster Checker test modules enforce the following steps:</p>
<ol>
<li>Check the base libraries and their uniformity (<b>base_libraries</b>)</li>
<li>Check that MPI tools have consistent paths (<b>mpi_consistency</b>)</li>
<li>Check that per-node MPI jobs can run Hello World independently (<b>intel_mpi_rt</b>)</li>
<li>Check that a global Hello World is successfully executed across compute nodes (<b>intel_mpi_rt_internode</b>)</li>
<li>Run Intel MPI Benchmarks such as Ping Pong to check available latency and bandwidth (<b>imb_pingpong_intel_mpi</b>)</li>
<li>Stress the communication system by running the HPCC benchmark (<b>hpcc</b>)</li>
</ol>
<p class="MsoNormal">If the tool reports a problem, an MPI application might fail to complete its work.</p>
<p class="MsoNormal">These steps will even catch potential timeouts due to a wrong network stack configuration and, most importantly, bad cabling or down hardware interfaces. However, if the cluster uses InfiniBand adapters, there is a known issue to be aware of: the global MPI check can hang, as any other MPI application would, if InfiniBand is not correctly configured and online.</p>
<blockquote>
<p class="MsoNormal">Intel(R) MPI Library Runtime Environment (All nodes), (intel_mpi_rt_internode, 1.8.....................................................^C</p>
<p class="MsoNormal">Caught signal INT, cleaning before termination.</p>
</blockquote>
<p class="MsoNormal">With InfiniBand setups, the configuration of Intel Cluster Checker must define openib and dat_conf as dependencies of intel_mpi_rt_internode. This ensures that the InfiniBand devices are properly detected and healthy: openib checks the hardware devices, and dat_conf checks the DAPL software interface.</p>
<blockquote>
<p class="MsoNormal">&lt;intel_mpi_rt_internode&gt;</p>
<p class="MsoNormal">&lt;add_dependency&gt;dat_conf&lt;/add_dependency&gt;</p>
<p class="MsoNormal">&lt;add_dependency&gt;openib&lt;/add_dependency&gt;</p>
<p class="MsoNormal">&lt;/intel_mpi_rt_internode&gt;</p>
</blockquote>
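<p class="MsoNormal">The effect of declaring these dependencies can be pictured with a toy model. The Python sketch below is not Intel Cluster Checker code and does not reflect its internals: it assumes that a check is not started when one of its dependencies did not pass, and the function run_checks and the healthy dictionary are hypothetical.</p>

```python
def run_checks(order, dependencies, healthy):
    """Run checks in order; skip a check when one of its dependencies did not pass."""
    results = {}
    for name in order:
        blocked = [d for d in dependencies.get(name, []) if results.get(d) != "passed"]
        if blocked:
            results[name] = "skipped"   # never started, so it cannot hang
        else:
            results[name] = "passed" if healthy[name] else "failed"
    return results

# Dependencies taken from the configuration fragment above.
dependencies = {"intel_mpi_rt_internode": ["dat_conf", "openib"]}
order = ["dat_conf", "openib", "intel_mpi_rt_internode"]

# Hypothetical cluster state: the InfiniBand hardware check fails.
healthy = {"dat_conf": True, "openib": False, "intel_mpi_rt_internode": True}
print(run_checks(order, dependencies, healthy))
```

<p class="MsoNormal">In this model the failing openib check is reported directly, and the global MPI check is never launched against a fabric that is known to be unhealthy.</p>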
<p class="MsoNormal">This decision cannot be made automatically, because whether or not to use the low latency, high bandwidth capabilities of InfiniBand during the check is at the discretion of the user. For instance, the administrator may want to double check that an Ethernet fabric can be properly used to run MPI applications.</p>
<p class="MsoNormal">Be aware that this manual requirement may be lifted in the near future.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/using-intel-cluster-checker-to-check-that-mpi-applications-will-properly-run-over-infiniband/</link>
      <pubDate>Tue, 07 Feb 2012 19:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/using-intel-cluster-checker-to-check-that-mpi-applications-will-properly-run-over-infiniband/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/using-intel-cluster-checker-to-check-that-mpi-applications-will-properly-run-over-infiniband/</guid>
      <category>Parallel Programming</category>
      <category>Intel® Cluster Ready</category>
      <category>Tools</category>
      <category>Intel Software Network communities</category>
      <category>Resources For Software Developers</category>
    </item>
    <item>
      <title>Video Conferencing features of Intel® Media Software Development Kit</title>
      <description><![CDATA[ <h2 class="sectionHeading">Download Article</h2>
Download <a href="http://software.intel.com/file/41357">Video Conferencing features of Intel® Media Software Development Kit</a> [PDF 568KB]<br /> <br />
<h2 class="sectionHeading">Abstract</h2>
This article explains how to use the new and optimized video conferencing features available as part of Intel® Media SDK 2012. Features common to video conferencing or streaming workloads are detailed together with source code references illustrating how a developer may use the feature in an application.<br /><br /> <br />
<h2 class="sectionHeading">Introduction</h2>
The Intel® Media Software Development Kit (Intel® Media SDK) is a software development library that exposes the media acceleration capabilities of Intel platforms for video decoding, video encoding, and video pre/post processing. Intel Media SDK helps developers rapidly develop software that accesses hardware acceleration for video codecs, with automatic fallback to software if hardware acceleration is not available.<br /> <br /> Intel Media SDK is available free of charge and can be downloaded from here: <a href="http://www.intel.com/software/mediasdk/">www.intel.com/software/mediasdk/</a><br /> <br /> Intel Media SDK features a wide range of application samples that illustrate how to use the SDK to encode, decode and transcode media to/from elementary video streams. Additionally, Intel Media SDK 2012 has added new samples illustrating how to use the SDK in the context of video conferencing.<br /> <br /> This article explains how to use the new and optimized video conferencing features of Intel Media SDK 2012. The features listed below address common video conferencing or streaming requirements for improved adaptation to transmission conditions, robustness and real-time responsiveness:<br /> <br /> 
<ul>
<li>Low Latency Encode and Decode</li>
<li>Dynamic Bit Rate Control</li>
<li>Dynamic Resolution Control</li>
<li>Forced Key Frame Generation</li>
<li>Reference List Selection</li>
<li>Reference Picture Marking Repetition SEI message</li>
<li>Long Term Reference Frame (LTR)</li>
<li>Temporal Scalability</li>
<li>Motion JPEG (MJPEG) Decode</li>
</ul>
<span >It is important to note that the majority of the above features were designed for the Media SDK AVC (H.264) codec. However, the MPEG2 encoder does support dynamic resolution control and dynamic bit rate control. Low latency MPEG2 encode or decode has not been optimized.</span><br /> <br /> In the following chapters we will explain how a developer may use these new SDK features in an application. For further details on how to use the features, please refer to the Intel Media SDK 2012 manual and samples.<br /><br /> <br />
<h2 class="sectionHeading">Intel® Media SDK</h2>
Intel® Media SDK supports hardware accelerated and software optimized media libraries for video encode, decode and processing functionality on Intel platforms. The optimized media libraries are built on top of Microsoft DirectX*, DirectX Video Acceleration (DXVA) APIs and platform graphics drivers. Intel® Media SDK exposes the hardware acceleration features of Intel® Quick Sync Video (QSV) built into 2nd generation Intel® Core™ processors. <a href="http://www.intel.com/technology/quicksync/index.htm">http://www.intel.com/technology/quicksync/index.htm</a><br /> <br /> The figure below provides a high level overview of where Intel® Media SDK fits into the software stack.<br /> <br />
<p ><img src="http://software.intel.com/file/41358" /></p>
<div ><b>Figure 1 - </b><i>Overview of Intel® Media SDK</i><br /></div>
<br /> For extensive details on all the features of Intel Media SDK please refer to the manual provided with the SDK.<br /><br /> <br />
<h2 class="sectionHeading">Intel® Media SDK Video Conferencing features</h2>
In this chapter we will detail features of the Intel® Media SDK that are important to a developer intending to build video conferencing or streaming applications.<br /> <br /> The figures below show a very simplified one-way video conferencing pipeline including Intel Media SDK DirectShow sample encoder and decoder filters in GraphEdit (from Microsoft* Windows SDK). A similar setup can be used to validate basic features of the SDK codecs such as low latency and dynamic bit rate control.<br /> <br />
<p ><img src="http://software.intel.com/file/41359" /></p>
<div ><b>Figure 2 - </b><i>Conceptual one way video conferencing pipeline</i><br /></div>
<br />
<p ><img src="http://software.intel.com/file/41360" /></p>
<div ><b>Figure 3 - </b><i>Microsoft* GraphEdit pipeline</i><br /></div>
<br /> A real life video conferencing application includes components for reverse pipeline, network transfer, preview, side band channels, et cetera. These components are all out of scope of this article.<br /> <br /> Note that many of the features described in this document are new features added in Intel Media SDK 2012 (API 1.3). To ensure access to the described features please make sure to initialize the SDK session specifying API 1.3 such as:<br /> <br />
<pre><code>mfxVersion ver = {3, 1};   // {Minor, Major}: requests API 1.3
mfxSession session;
MFXInit(MFX_IMPL_AUTO, &amp;ver, &amp;session);</code></pre>
<br />Note that Intel® Media SDK API 1.3 HW acceleration will not be available until the first public Intel graphics driver for next generation Intel® Core™ platforms is released, early 2012.<br /> <br /> Intel® Media SDK provides sample code on how to use the new video conferencing features:<br /> <br /> 
<table class="table-padding" border="1" cellpadding="0" cellspacing="0">
<tbody>
<tr>
<td><b>sample_decode</b></td>
<td>Low latency decode</td>
</tr>
<tr>
<td><b>sample_videoconf</b></td>
<td>H.264 encoder configurations illustrating use of low latency, LTR, reference lists, key frame generation et cetera.</td>
</tr>
<tr>
<td><b>sample_dshow_plugins</b></td>
<td>Microsoft DirectShow® filter implementations. New low latency preset added.</td>
</tr>
</tbody>
</table>
<br /><br />
<h2 class="sectionHeading">Low Latency Encode and Decode</h2>
Low latency codecs improve real-time responsiveness. This is achieved by minimizing internal codec delay and buffering.<br /> <br /> To enable low latency mode using Intel® Media SDK the developer must configure the encoder and decoder with a specific set of parameters.<br /> <br /> The sample Intel® Media SDK Microsoft* DirectShow H.264 encoder and decoder filters enable this feature by introducing a new “Low Latency” preset (PRESET_LOW_LATENCY) which configures the encoder with appropriate parameters to minimize latency. See encoder filter dialog box screenshot below.<br /> <br />
<p ><img src="http://software.intel.com/file/41361" /></p>
<div ><b>Figure 4 - </b><i>Microsoft* DirectShow H.264 Encode sample filter - Low Latency preset</i><br /></div>
<br /> Note that enabling the low latency encoder filter preset also allows the bit rate to be controlled dynamically via the filter properties UI.<br /> <br /> <b>Encoder configuration</b><br /> <br /> To configure the Intel® Media SDK encoder for optimal low latency the following set of parameters should be used:<br /> <br />
<pre><code>
mfxVideoParam::AsyncDepth = 1
mfxInfoMFX::GopRefDist = 1
mfxInfoMFX::NumRefFrame = 1
</code></pre>
<br /><br /> The <code>AsyncDepth</code> setting limits internal frame buffering. This also requires the application to synchronize after encoding each frame. The <code>GopRefDist</code> setting forces the encoder not to use B-frames. <code>NumRefFrame</code> has the effect of using only the previous P-frame as reference.<br /> <br /> The encoder must also be configured to use the extended buffer type <code>mfxExtCodingOption</code> (<code>MFX_EXTBUFF_CODING_OPTION</code>), with a specific setting for decoder frame/picture buffering (DPB), to ensure that each decoded frame is displayed immediately after decoding:<br /> <br />
<pre><code>
mfxExtCodingOption::MaxDecFrameBuffering = 1
</code></pre>
<br /><br /> <b>Decoder configuration</b><br /> <br /> To enable the Intel® Media SDK decoder for low latency the following set of parameters should be used:<br /> <br />
<pre><code>
mfxVideoParam::AsyncDepth = 1
</code></pre>
<br /><br /> The <code>AsyncDepth</code> setting limits internal frame buffering. This also requires the application to synchronize after decoding each frame. <br /> <br /> The decoder bit stream <code>DataFlag</code> should also be set to indicate that a full frame is in the buffer. Note that if a full frame is not in the decoder bit stream buffer, the decoded frame will have artifacts.<br /> <br />
<pre><code>
mfxBitstream::DataFlag = MFX_BITSTREAM_COMPLETE_FRAME
</code></pre>
<br /><br /> It is also suggested that the decoder bit stream buffer is only provided one frame at a time.<br /> <br />
<h2 class="sectionHeading">Dynamic Bit Rate Control</h2>
To be able to adapt to varying network transmission conditions it is important that an encoder has the capability to adjust bit rate at any time during an encoding session.<br /> <br /> An application can change bit rate using the <code>TargetKbps</code> and/or <code>MaxKbps</code> parameter by calling the <code>MFXVideoENCODE_Reset</code> function at any time during encode operation.<br /> <br /> If Hypothetical Reference Decoder (HRD) compliance is required then <code>mfxExtCodingOption::NalHrdConformance</code> should be set (<code>MFX_CODINGOPTION_ON</code>). In that case bit rate change is only allowed in Variable Bit Rate (VBR) mode and the encoder will also generate a key frame every time the bit rate is changed.<br /> <br /> In case HRD compliance is not required, bit rate can also be changed in Constant Bit Rate (CBR) and Average Variable Bit Rate (AVBR) mode, by setting <code>mfxExtCodingOption::NalHrdConformance</code> to off, <code>MFX_CODINGOPTION_OFF</code> (this is also the default setting). This mode also eliminates key frame generation every time the bit rate is changed. However, if key frame generation is required please follow the method described in the key frame generation section.<br /> <br /> Alternatively, the application may use the Constant Quantization Parameter (CQP) encoding mode to perform customized bit rate adjustment on a per-frame basis. For more information please refer to the Intel Media SDK video conferencing sample.<br /> <br /> MPEG2 encoder usage note: Dynamic bit rate change will always result in generation of key frame.<br /> <br />
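The rules above can be summarized in a small decision helper. This is a minimal sketch and not SDK code: the <code>RateControl</code> enum, the <code>BitrateChange</code> struct and the <code>evaluateBitrateChange</code> function are hypothetical names introduced for illustration, and the sketch assumes that VBR remains adjustable when HRD conformance is off. It covers the H.264 encoder only.<br /> <br />

```cpp
// Hypothetical stand-ins for the SDK rate control modes (assumption of this
// sketch); a real application would use the constants from the SDK headers.
enum class RateControl { CBR, VBR, AVBR, CQP };

// Outcome of asking for a new target bit rate in the middle of a session.
struct BitrateChange {
    bool allowed;         // may TargetKbps/MaxKbps be changed through Reset?
    bool forcesKeyFrame;  // does the encoder emit a key frame on the change?
};

// Encodes the rules described in the text for the H.264 encoder:
//  - HRD conformance on : only VBR may change bit rate, and every change
//    produces a key frame.
//  - HRD conformance off: CBR, VBR and AVBR may change bit rate, and no
//    key frame is generated by the change itself.
//  - CQP has no target bit rate; the application adjusts QP per frame.
BitrateChange evaluateBitrateChange(RateControl mode, bool nalHrdConformance) {
    if (mode == RateControl::CQP) {
        return {false, false};
    }
    if (nalHrdConformance) {
        const bool isVbr = (mode == RateControl::VBR);
        return {isVbr, isVbr};
    }
    return {true, false};
}
```

An application could call such a helper before <code>MFXVideoENCODE_Reset</code> to decide whether a requested change is worth attempting, and whether a bit rate spike from a key frame should be expected.<br /> <br />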
<h2 class="sectionHeading">Dynamic Resolution Control</h2>
The Intel Media SDK encoder supports dynamic resolution change in all bit rate control modes. The application may change resolution by calling the <code>MFXVideoENCODE_Reset</code> function.<br /> <br /> Note that the application cannot increase resolution beyond the size specified during encoder initialization.<br /> <br /> A resolution change always results in insertion of a key frame, and the encoder does not guarantee HRD conformance across the change.<br /> <br />
<h2 class="sectionHeading">Forced Key Frame Generation</h2>
The ability to insert key frames at any time during encoding enables greater control over stream quality, robustness and error correction.<br /> <br /> Encoder frame type control depends on the selected encoder order mode:<br /> <br /> 
<ul>
<li>Display Order: The application can force the current frame to be a key frame, but cannot change the frame type of frames already buffered inside the encoder</li>
<li>Encoded Order: The application must specify the exact frame type for every frame, so it can force the current frame to be any frame type that the standard allows</li>
</ul>
To control the encoded frame type the application can set the <code>FrameType</code> parameter of the <code>mfxEncodeCtrl</code> structure. A pointer to the <code>mfxEncodeCtrl</code> structure is passed as the encode control parameter of the <code>MFXVideoENCODE_EncodeFrameAsync</code> call and gives the developer additional control over the encoding operation. Key frame generation control is illustrated in the example below:<br /> <br />
<pre><code>
mfxEncodeCtrl EncodeCtrl;
memset(&amp;EncodeCtrl, 0, sizeof(mfxEncodeCtrl));
EncodeCtrl.FrameType = 
   MFX_FRAMETYPE_I | MFX_FRAMETYPE_REF | MFX_FRAMETYPE_IDR;
MFXVideoENCODE_EncodeFrameAsync(&amp;EncodeCtrl, …);
</code></pre>
<br /><br />
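In a video conferencing application the decision to force a key frame is typically driven by receiver feedback. The following is a minimal sketch of such a policy under stated assumptions: <code>ReceiverFeedback</code>, <code>chooseFrameType</code> and the spacing threshold are hypothetical, and the flag values are stand-ins that are assumed to mirror the SDK frame type constants. Real code should use the constants from the SDK headers.<br /> <br />

```cpp
#include <cstdint>

// Stand-in flag values (assumptions of this sketch); real code should use
// MFX_FRAMETYPE_I, MFX_FRAMETYPE_REF and MFX_FRAMETYPE_IDR from the SDK.
constexpr std::uint16_t kFrameTypeI   = 0x0001;
constexpr std::uint16_t kFrameTypeRef = 0x0040;
constexpr std::uint16_t kFrameTypeIdr = 0x0080;

// Hypothetical summary of what the receiving side reported.
struct ReceiverFeedback {
    bool keyFrameRequested;                 // receiver asked for a refresh
    std::uint32_t framesSinceLastKeyFrame;  // counted by the sender
};

// Returns the flags to store in the FrameType field for the next frame,
// or 0 when the caller should not override the encoder's own choice.
// The spacing check keeps repeated requests from flooding the link with
// large key frames.
std::uint16_t chooseFrameType(const ReceiverFeedback& feedback,
                              std::uint32_t minKeyFrameSpacing) {
    if (feedback.keyFrameRequested &&
        feedback.framesSinceLastKeyFrame >= minKeyFrameSpacing) {
        return static_cast<std::uint16_t>(kFrameTypeI | kFrameTypeRef |
                                          kFrameTypeIdr);
    }
    return 0;
}
```

When the helper returns a non-zero value, the application would copy it into <code>FrameType</code> as in the example above; otherwise it would submit the frame without a frame type override.<br /> <br />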
<h2 class="sectionHeading">Reference List Selection</h2>
The Reference List Selection feature is useful if the encoder application can obtain feedback about client side frame reception conditions. Based upon this information the application may want to adjust the encoder to use or not use certain frames as reference to improve robustness and error resilience.<br /> <br />
<p ><img src="http://software.intel.com/file/41362" /></p>
<div ><b>Figure 5 - </b><i>Reference frame feedback</i><br /></div>
<br /> The application can specify the reference window size by setting the parameter <code>mfxInfoMFX::NumRefFrame</code> during encoding initialization. Depending on the platform, there is a limit on how large the reference window can be. To determine the actual parameter set after initialization, use the function <code>MFXVideoENCODE_GetVideoParam</code> to retrieve the current working set of parameters (including the actual <code>NumRefFrame</code> used). Note that the size of the reference window also depends on the selected codec profile/level and resolution.<br /> <br /> During encoding, the application can specify the actual reference list sizes by attaching the <code>mfxExtAVCRefListCtrl</code> (<code>MFX_EXTBUFF_AVC_REFLIST_CTRL</code>) structure to the <code>MFXVideoENCODE_EncodeFrameAsync</code> call. Note that <code>mfxExtAVCRefListCtrl</code> is used as an extended buffer in the <code>mfxEncodeCtrl</code> structure. The <code>NumRefIdxL0Active</code> parameter of the <code>mfxExtAVCRefListCtrl</code> structure specifies the size of the reference list L0 (for B and P frame prediction according to the AVC standard) and the <code>NumRefIdxL1Active</code> parameter specifies the size of the reference list L1 (for B frame prediction according to the AVC standard). These two values specify the actual size of the reference lists, and must be less than or equal to the <code>mfxInfoMFX::NumRefFrame</code> value that was set during encoding initialization.<br /> <br /> Using the same extended buffer, the application can also instruct the encoder to use or not use certain reference frames. The application specifies the preferred reference frame list <code>PreferredRefList</code> and/or the rejected frame list <code>RejectedRefList</code> in the <code>mfxExtAVCRefListCtrl</code> structure. The two lists control how the encoder chooses the reference frames of the current frame.<br /> <br /> There are a few limitations:<br /> <br /> 
<ul>
<li>Application must uniquely identify each input frame, by setting the <code>mfxFrameData::FrameOrder</code> parameter.</li>
<li>The frames in the lists are ignored if they are out of the reference window. </li>
<li>If by going through the lists, the SDK encoder cannot find a reference frame for the current frame, the SDK encoder will encode the current frame using Intra prediction only.</li>
<li>If the GOP pattern contains B-frames, the SDK encoder will not be able to follow the <code>mfxExtAVCRefListCtrl</code> instructions (the instructions will be ignored).</li>
<li>Reference list control is only supported in progressive encoding mode.</li>
</ul>
Make sure to set <code>FrameOrder = MFX_FRAMEORDER_UNKNOWN</code> to mark unused reference list items.<br /> <br /> For instance, to indicate to an encoder that is about to encode frame 100 that frames 98 and 99 were received as corrupted frames on the decoder client side, the reference list can be specified as follows (assumes proper initialization of unused frames):<br /> <br />
<pre><code>
RejectedRefList[0].FrameOrder = 98;
RejectedRefList[0].PicStruct = MFX_PICSTRUCT_PROGRESSIVE;
RejectedRefList[1].FrameOrder = 99;
RejectedRefList[1].PicStruct = MFX_PICSTRUCT_PROGRESSIVE;
</code></pre>
<br /><br /> Similar code applies to setting <code>PreferredRefList</code>, resulting in reordering the reference list for the currently encoded frame.<br /> <br />
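The "proper initialization of unused frames" mentioned above can be sketched as a helper that first marks every entry as unused and then fills in the frames of interest. This is an illustration under stated assumptions: <code>RefEntry</code> is a stand-in for one list entry, the constant values are assumed to mirror <code>MFX_FRAMEORDER_UNKNOWN</code> and <code>MFX_PICSTRUCT_PROGRESSIVE</code>, and <code>fillRefList</code> is a hypothetical name. Real code would operate on the lists inside <code>mfxExtAVCRefListCtrl</code> using the SDK constants.<br /> <br />

```cpp
#include <cstddef>
#include <cstdint>

// Stand-in for one reference list entry (assumption of this sketch).
struct RefEntry {
    std::uint32_t FrameOrder;
    std::uint16_t PicStruct;
};

// Stand-in values assumed to mirror the SDK constants.
constexpr std::uint32_t kFrameOrderUnknown    = 0xFFFFFFFFu;
constexpr std::uint16_t kPicStructProgressive = 0x01;

// Marks every entry of `list` as unused, then stores the given frame
// numbers at the front of the list as progressive frames.
// Returns the number of entries written; frames that do not fit in the
// list are silently dropped.
template <std::size_t N, std::size_t M>
std::size_t fillRefList(RefEntry (&list)[N],
                        const std::uint32_t (&frames)[M]) {
    for (std::size_t i = 0; i < N; ++i) {
        list[i].FrameOrder = kFrameOrderUnknown;  // unused item
        list[i].PicStruct  = 0;                   // zeroed, as with memset
    }
    const std::size_t count = (M < N) ? M : N;
    for (std::size_t i = 0; i < count; ++i) {
        list[i].FrameOrder = frames[i];
        list[i].PicStruct  = kPicStructProgressive;
    }
    return count;
}
```

For the example above, the application would pass an array holding 98 and 99 for the rejected list. The same helper applies to the preferred list, where the order of the entries determines the reordering.<br /> <br />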
<h2 class="sectionHeading">Reference Picture Marking Repetition SEI Message</h2>
As with reference list selection, improved robustness and error resilience can be achieved by using the Reference Picture Marking Repetition Supplemental Enhancement Information (SEI) message feature, as defined by the AVC standard (D.1.8).<br /> <br /> The message is used to repeat the decoded reference picture marking syntax structures of earlier decoded pictures. Consequently, even if earlier reference pictures were lost, the decoder can still maintain the correct status of the reference picture buffer and reference picture lists.<br /> <br /> The application can request writing the Reference Picture Marking Repetition SEI message during encoding initialization, by setting the <code>RefPicMarkRep</code> flag to <code>MFX_CODINGOPTION_ON</code> in the <code>mfxExtCodingOption</code> (<code>MFX_EXTBUFF_CODING_OPTION</code>) extended buffer.<br /> <br /> The decoder will respond to the reference picture marking repetition SEI message if such a message exists in the bitstream, and check it against the reference list information specified in the sequence/picture headers. The decoder will report any mismatch between the SEI message and the reference list information via the <code>mfxFrameData::Corrupted</code> field.<br /> <br />
<h2 class="sectionHeading">Long Term Reference Frame</h2>
An application may use a Long-Term Reference (LTR) frame to improve coding efficiency. For instance, an LTR may be useful if a certain pattern is continuously part of the frame background over a long period of time, or to store a representation of a camera view when switching to another camera, enabling better prediction when switching back to the prior camera view. Assigning an LTR allows the encoder to tell the decoder to hold onto a frame longer than it would a short-term reference frame.<br /> <br /> Unlike a short-term reference frame (controlled by the encoder), an LTR frame is controlled entirely by the application. The encoder itself never marks or unmarks a frame as an LTR.<br /> <br /> Each frame has a unique number, <code>FrameOrder</code> in the <code>mfxFrameData</code> structure, and the application uses this number to identify the frame during the marking process.<br /> <br /> The application uses the <code>mfxExtAVCRefListCtrl</code> buffer to mark a frame as LTR and later to unmark it. To mark a frame as LTR, put its number (<code>FrameOrder</code>) in the <code>mfxExtAVCRefListCtrl::LongTermRefList</code> list. After marking, the encoder will use this LTR frame as reference for all subsequent frames until the frame is unmarked. To unmark a frame, put its number in the <code>mfxExtAVCRefListCtrl::RejectedRefList</code> list. An LTR is also automatically unmarked by an IDR frame. <br /> <br /> Note that a frame can only be marked as LTR if it is present inside the encoder frame buffer.<br /> <br /> The encoder puts all long-term reference frames at the end of a reference frame list. If the number of active reference frames (the <code>NumRefIdxL0Active</code> and <code>NumRefIdxL1Active</code> values in the <code>mfxExtAVCRefListCtrl</code> extended buffer) is smaller than the total reference frame number (the <code>NumRefFrame</code> value in the <code>mfxInfoMFX</code> structure during the encoding initialization), the SDK encoder may ignore some or all long-term reference frames. 
The application may avoid this by providing a list of preferred reference frames in the <code>PreferredRefList</code> list in the <code>mfxExtAVCRefListCtrl</code> extended buffer. In this case, the SDK encoder reorders the reference list based on the specified list.<br /> <br /> For instance, to set frame 100 as an LTR frame, initialize the reference list as follows (assumes proper initialization of unused frames):<br /> <br />
<pre><code>
LongTermRefList[0].FrameOrder = 100;
LongTermRefList[0].PicStruct = MFX_PICSTRUCT_PROGRESSIVE;
</code></pre>
<br /><br />
<h2 class="sectionHeading">Temporal Scalability</h2>
Temporal scalability is stream scalability in terms of frame rate, meaning that a given bit stream has the ability to have multiple frame rates.<br /> <br /> For instance, a stream may have a base layer frame rate of 7.5 fps. Additional temporal layers may have frame rate of 15, 30 and 60 fps allowing improved error resiliency in decoder in case of packet loss/frame corruption by lowering the frame rate while maintaining the quality (note that some specifications instead define the greatest rate layer to be the base layer, such as 60 fps in this example).<br /> <br /> Temporal scalability is achieved by encoding stream in such a way that frames can be skipped during decoding since they do not have other frames depending on them, thus adjusting decoded frame rate. This is illustrated in a simplified temporal stream example below where max frame rate is 60 fps.<br /> <br />
<p ><img src="http://software.intel.com/file/41364" /></p>
<div ><b>Figure 6 - </b><i>Temporal scalability - Frame dependencies</i><br /></div>
<br /> In the above figure, consider the green frames (1, 3, 5 etc.). No frame depends on them, so they can be skipped and all remaining frames can still be decoded, cutting the frame rate by a factor of 2 from 60 fps to 30 fps. Once the green frames are skipped, the blue frames can also be skipped since no remaining frame depends on them, cutting the frame rate in half again to 15 fps. In the same way the decoder can skip the black frames, resulting in 7.5 fps.<br /> <br /> Note that the Media SDK decoder does not support layer selection. The application must interpret the encoded bit stream headers and decide which temporal layers to decode.<br /> <br /> <b>Usage</b><br /> <br /> The application may specify the temporal hierarchy of frames by using the <code>mfxExtAvcTemporalLayers</code> (<code>MFX_EXTBUFF_AVC_TEMPORAL_LAYERS</code>) extended buffer. This functionality is limited to display order mode.<br /> <br /> To distinguish different temporal layers, the encoder inserts a prefix Network Abstraction Layer (NAL) unit before each slice with unique temporal and priority IDs. The encoder starts temporal IDs from zero and priority IDs from <code>BaseLayerPID</code>, increasing both of them by one for each consecutive layer.<br /> <br /> If the application additionally needs to specify unique sequence or picture parameter set IDs it should use the <code>mfxExtCodingOptionSPSPPS</code> (<code>MFX_EXTBUFF_CODING_OPTION_SPSPPS</code>) extended buffer, set all pointers and sizes to zero, and use only the <code>SPSId</code>/<code>PPSId</code> fields. The same Sequence Parameter Set (SPS) and Picture Parameter Set (PPS) ID will be used for all temporal layers.<br /> <br /> Each temporal layer is defined by a <code>Scale</code> parameter. This is the ratio of frame rates between the temporal layer and the base layer. The application may skip some of the temporal layer(s) by setting the <code>Scale</code> parameter equal to zero. In this case temporal layers with the corresponding temporal IDs will be absent from the stream. 
Also, two consecutive temporal layers must have an integer ratio of frame rates. For instance, given two layers of 60 fps and 30 fps, an additional layer cannot be set to 20 fps since 30/20 is not an integer. However, an additional layer of 15 fps is accepted since 30/15 is an integer.<br /> <br /> For instance, to encode a 60 fps stream (<code>FrameRateExtN/FrameRateExtD</code> specifies the frame rate of the highest layer) with a base temporal layer of 15 fps and additional temporal layers of 30 and 60 fps, the <code>mfxExtAvcTemporalLayers</code> extended buffer would be configured as follows:<br /> <br />
<pre><code>
mfxExtAvcTemporalLayers TemporalLayers;
memset(&amp;TemporalLayers, 0, sizeof(mfxExtAvcTemporalLayers));

TemporalLayers.BaseLayerPID = 0;   // Priority ID of base layer 
TemporalLayers.Layer[0].Scale = 1; // base layer, 15fps/15fps = 1
TemporalLayers.Layer[1].Scale = 2; // first layer, 30fps/15fps = 2
TemporalLayers.Layer[2].Scale = 4; // second layer, 60fps/15fps = 4
TemporalLayers.Layer[3].Scale = 0; // No layer
</code></pre>
<br /><br />
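The <code>Scale</code> rules and the frame skipping pattern described above can be checked with a short, self-contained sketch. It does not use the SDK: <code>scalesAreValid</code> and <code>temporalLayerOfFrame</code> are hypothetical helpers, and the sketch assumes that the base layer has <code>Scale</code> 1, that scales increase from layer to layer, and that frames are distributed in a regular pattern like the one in Figure 6. The temporal IDs actually assigned by the encoder should be read from the prefix NAL units in the stream.<br /> <br />

```cpp
#include <cstddef>
#include <cstdint>

// Checks a Scale array: the base layer must be 1 (assumption), a zero
// marks a skipped layer, and each present layer must be an integer
// multiple (greater than 1) of the previous present layer.
bool scalesAreValid(const std::uint16_t* scale, std::size_t count) {
    if (count == 0 || scale[0] != 1) {
        return false;
    }
    std::uint16_t previous = 1;
    for (std::size_t i = 1; i < count; ++i) {
        if (scale[i] == 0) {
            continue;  // skipped layer
        }
        if (scale[i] <= previous || scale[i] % previous != 0) {
            return false;
        }
        previous = scale[i];
    }
    return true;
}

// Returns the index of the lowest layer that contains the frame with the
// given display-order index (0-based), or -1 for an invalid Scale array.
// A frame belongs to layer i if its index is a multiple of top/scale[i],
// where top is the Scale of the highest present layer.
int temporalLayerOfFrame(std::uint32_t frameIndex,
                         const std::uint16_t* scale, std::size_t count) {
    if (!scalesAreValid(scale, count)) {
        return -1;
    }
    std::uint16_t top = 1;
    for (std::size_t i = 0; i < count; ++i) {
        if (scale[i] != 0) {
            top = scale[i];  // scales increase, so the last one is highest
        }
    }
    for (std::size_t i = 0; i < count; ++i) {
        if (scale[i] == 0) {
            continue;
        }
        const std::uint32_t period = static_cast<std::uint32_t>(top / scale[i]);
        if (frameIndex % period == 0) {
            return static_cast<int>(i);
        }
    }
    return -1;  // not reached: the highest layer has period 1
}
```

With the configuration above (scales 1, 2, 4, 0), frames 0 and 4 fall in the base layer, frame 2 in the 30 fps layer and frames 1 and 3 in the 60 fps layer. A receiver that drops every frame whose layer index exceeds a chosen limit obtains the reduced frame rates described earlier.<br /> <br />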
<h2 class="sectionHeading">MJPEG Decode</h2>
Intel® Media SDK 2012 (using API 1.3) introduces a new MJPEG decoder component conforming to the ITU T.81 standard. At this point the MJPEG decoder is only available as a software implementation.<br /> <br /> The Intel Media SDK MJPEG decoder is enabled by using the <code>MFX_CODEC_JPEG</code> codec identifier and uses the same set of API function calls as the other SDK decoders.<br /> <br /> A key difference compared to other SDK decoders is that the MJPEG decoder can also deliver decoded video frames in the RGB4 color format (besides the common NV12 format). The decoder also supports frame rotation in steps of 90 degrees.<br /> <br /> For more details regarding the MJPEG decoder please refer to the Intel Media SDK MJPEG manual, the sample_decode or the Intel Media SDK Microsoft* DirectShow MJPEG sample filter code.<br /> <br /> As a side note, the SDK MJPEG decoder can also effectively be used as a single JPEG image decoder.<br /> <br />
<h2 class="sectionHeading">Conclusion</h2>
In this article we presented the new Intel® Media SDK 2012 video conferencing features: Low Latency Encode and Decode, Dynamic Bit Rate and Resolution Control, Forced Key Frame Generation, Reference List Selection, Reference Picture Marking Repetition SEI Message, Long Term Reference, Temporal Scalability and the new MJPEG decoder component.<br /> <br /> By utilizing these new features developers can build flexible video conferencing applications using Intel Media SDK, taking advantage of the hardware acceleration capabilities of Intel platforms.<br /> <br /> For further details on how to use the features please refer to the Intel Media SDK samples included in the SDK install package.<br /> <br /> For developer questions on how to use Intel Media SDK please refer to the Intel® Media SDK forum on the Intel Software Network site: <a href="http://software.intel.com/en-us/forums/intel-media-sdk/">http://software.intel.com/en-us/forums/intel-media-sdk/</a><br /> <br />
<h2 class="sectionHeading">References</h2>
<ul>
<li>Intel® Media Software Development Kit: <a href="http://www.intel.com/software/mediasdk">www.intel.com/software/mediasdk</a></li>
<li>Microsoft* Windows SDK: <a href="http://msdn.microsoft.com/en-us/windows/aa904949.aspx" target="_blank">http://msdn.microsoft.com/en-us/windows/aa904949.aspx</a></li>
</ul>
<h2 class="sectionHeading">Terminology</h2>
<table class="tableFormat1" border="0" cellpadding="10" cellspacing="0">
<tbody>
<tr>
<td >Term</td>
<td >Description</td>
</tr>
<tr>
<td>DPB</td>
<td>Decode Picture Buffer</td>
</tr>
<tr>
<td>LTR</td>
<td>Long Term Reference (frame)</td>
</tr>
<tr>
<td>API</td>
<td>Application Programming Interface</td>
</tr>
<tr>
<td>DXVA</td>
<td>DirectX Video Acceleration</td>
</tr>
<tr>
<td>DDI</td>
<td>Device Driver Interface</td>
</tr>
<tr>
<td>SEI</td>
<td>Supplemental Enhancement Information</td>
</tr>
<tr>
<td>QSV</td>
<td>Intel® Quick Sync Video Technology</td>
</tr>
<tr>
<td>CQP</td>
<td>Constant Quantization Parameter</td>
</tr>
<tr>
<td>HRD</td>
<td>Hypothetical Reference Decoder</td>
</tr>
<tr>
<td>NAL</td>
<td>Network Abstraction Layer</td>
</tr>
<tr>
<td>SPS</td>
<td>Sequence Parameter Set</td>
</tr>
<tr>
<td>PPS</td>
<td>Picture Parameter Set</td>
</tr>
<tr>
<td>VBR</td>
<td>Variable Bit Rate</td>
</tr>
<tr>
<td>AVBR</td>
<td>Average Variable Bit Rate</td>
</tr>
<tr>
<td>CBR</td>
<td>Constant Bit Rate</td>
</tr>
<tr>
<td>MJPEG</td>
<td>Motion JPEG (ITU T.81 standard)</td>
</tr>
<tr>
<td>AVC</td>
<td>Advanced Video Coding (ITU-T H.264 standard)</td>
</tr>
<tr>
<td>RGB4</td>
<td>RGB (Red, Green, Blue) pixel color format. A 32 bit format also known as RGB32</td>
</tr>
<tr>
<td>NV12</td>
<td>Common hybrid planar YUV color format</td>
</tr>
</tbody>
</table>
<div id="vc-meta" >
<div id="vc-meta-author">
<div></div>
</div>
<div id="vc-meta-pubdate">02-06-2012</div>
<div id="vc-meta-modificationdate">02-06-2012</div>
<div id="vc-meta-taxonomy">Tech Articles</div>
<div id="vc-meta-category-product">
<div></div>
</div>
<div id="vc-meta-category">
<div></div>
</div>
<div id="vc-meta-thumb"></div>
<div id="vc-meta-abstract">This paper explains how to use the new and optimized video conferencing features of Intel Media SDK 2012. The features address common video conferencing or streaming requirements for improved adaptation to transmission conditions, robustness and real-time responsiveness.</div>
</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/video-conferencing-features-of-intel-media-software-development-kit/</link>
      <pubDate>Mon, 06 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/video-conferencing-features-of-intel-media-software-development-kit/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/video-conferencing-features-of-intel-media-software-development-kit/</guid>
      <category>Visual Computing</category>
      <category>Visual Computing Source</category>
    </item>
    <item>
      <title>Visual Computing Source - News &amp; Events</title>
      <description><![CDATA[ <link media="screen" href="http://software.intel.com/sites/landingpage/vcsource/css/style.css" type="text/css" rel="stylesheet" />
<div class="events" id="wrap"><a href="http://software.intel.com/en-us/articles/vcsource/" id="logo">Visual Computing Source</a>
<div id="navigation"></div>
<!-- /navigation -->
<div id="container">
<ul id="breadcrumb">
<li><a href="http://software.intel.com/en-us/articles/vcsource/">Dashboard</a></li>
<li>News &amp; Events</li>
</ul>
<!-- /breadcrumb -->
<div id="social"></div>
<!-- /social -->
<div class="clear"></div>
<div id="customize">
<h2>Customize your visual computing content</h2>
<ul class="topic-nav">
<li><a rel="all" href="http://software.intel.com#">ALL</a></li>
<li><a rel="gaming" href="http://software.intel.com#">GAMING</a></li>
<li><a rel="media" href="http://software.intel.com#">MEDIA</a></li>
</ul>
</div>
<!-- /customize -->
<div class="clear"></div>
<div id="page-content">
<div id="left">
<div class="feed-container two-col" id="vcsource_type_event"></div>
</div>
<!-- left -->
<div id="rhc">
<div class="rhc-box filter" id="filter"><!-- filter for feed items --></div>
<!-- /rhc-box --></div>
<!-- /rhc --></div>
<!-- /page-content --></div>
<!-- /container -->
<div class="clear"></div>
</div>
<!-- /wrap -->
<div class="clear"></div>
<link href="http://software.intel.com/sites/landingpage/vcsource/css/ie8.css" type="text/css" rel="stylesheet" />

 ]]></description>
      <link>http://software.intel.com/en-us/articles/vcsource-news-events/</link>
      <pubDate>Mon, 06 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/vcsource-news-events/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/vcsource-news-events/</guid>
      <category>Visual Computing</category>
      <category>Visual Computing Source</category>
    </item>
    <item>
      <title>SIGCSE Intel Parallelism Lightning Rounds</title>
      <description><![CDATA[ <b><img src="http://software.intel.com/file/41234" />
<p> </p>
<p> </p>
Welcome to the Intel® Academic Community &amp; EAPF Parallelism Lightning Rounds at SIGCSE 2012.</b>
<div><b><br /></b></div>
<b>What are the Parallelism Lightning Rounds?<br /></b>
<p>At SIGCSE 2012, we are looking to ignite attendees on the wide range of teaching content, tools, games and examples that can be used in the classroom to introduce parallel concepts into computer science and computational sciences at many levels.</p>
<p>Each presenter will get 5 minutes and five slides to demonstrate how they have brought parallelism into their curriculum.</p>
<div><span >This is meant to be a fun, informative and informed event</span> based on the <a target="_blank" href="http://en.wikipedia.org/wiki/Ignite_(event)">Ignite Live Forums</a>.  Ignite is a global event, organized by volunteers, where participants are given five minutes to talk about their ideas and personal or professional passions, accompanied by limited slides. The presentations are meant to "ignite" the audience on a subject, i.e. to generate awareness and to stimulate thought and action on the subjects presented.</div>
<p> </p>
<p><b>Who can apply?</b></p>
<p>This event is open to any SIGCSE attendee teaching at the K-12, college or graduate level.   You must be over 18 to participate.</p>
<p> </p>
<p><b>How to apply?</b></p>
<p>Simply send one or two paragraphs describing the content for the 5 slides that you plan to show during the event to <a href="mailto:academic.community@intel.com?subject=Lightning%20Rounds">academic.community@intel.com</a>.  Please put <i>Lightning Round</i> in the subject.</p>
<p>All entries must include:</p>
<ol>
<li><span >Your name </span></li>
<li><span >Your email</span></li>
<li><span >A phone number where you can be reached while at SIGCSE</span></li>
<li><span >The institution where you teach</span></li>
<li><span >What classes this material would be best suited for </span><span >(e.g. 1</span><sup >st</sup><span > year CS Data Structures class)</span></li>
<li><span >A short paragraph detailing the specific topic, technique, or problem you will address in your five-minute lightning round</span></li>
<li><span >A description of the five slides you will present</span></li>
</ol>
<p class="MsoNormal"><b>How will decisions be made regarding presenters selected for the event?</b></p>
<p class="MsoNormal">The EAPF SIGCSE Committee will select participants based on relevance, clarity, and creativity.</p>
<p class="MsoNormal"><b>What are the fabulous prizes?</b></p>
<p class="MsoNormal"><span>The first 25 people who submit appropriate content will receive a copy of Dr. Clay Breshears' book, </span><i>The Art of Concurrency: A Thread Monkey's Guide to Writing Parallel Applications</i><span>.</span></p>
<p class="MsoNormal"><span>Everyone who presents will be entered into a drawing for an Asus Zenbook Ultrabook™ with an Intel® Core™ i5 processor, 13.3" display, 4GB memory, and 128GB solid state drive.</span></p>
<p class="MsoNormal">(You can also enter this drawing by visiting the Intel booth at SIGCSE and participating in our passport program).</p>
<p class="MsoNormal"><b>Is this award only for Computer Science instructors?</b></p>
<p class="MsoNormal">No. We welcome material applicable to a wide array of disciplines.</p>
<p class="MsoNormal"><b>Example content ideas:</b></p>
<p class="MsoNormal">We welcome all entries and encourage you to use your knowledge and creativity. Below are some examples of great content ideas. What are yours?</p>
<ol type="1" >
<li class="MsoNormal">Describe how you parallelized the standard merge sort algorithm and how you introduced it into your classroom.</li>
<li class="MsoNormal">Showcase a classroom exercise or homework assignment that describes a queue data structure in a parallel setting.</li>
<li class="MsoNormal">Show how you used an existing textbook and added parallel examples or problems.</li>
<li class="MsoNormal">Describe how you challenge your students to find ways to scale beyond four cores.</li>
<li class="MsoNormal">Describe a game you used to illustrate parallel concepts.</li>
<li class="MsoNormal">Tell us how you introduced parallel libraries.</li>
<li class="MsoNormal">Talk about your experiences using programming environments such as Alice or Scratch to introduce parallelism.</li>
</ol>
<p class="MsoNormal"><b>What are the conditions?</b></p>
<p class="MsoNormal">Winners will be contacted by phone and e-mail and must be present at SIGCSE in order to win. All prizes must be picked up at the Intel booth by 12 pm on Saturday, March 3rd, 2012, after which time they will be considered forfeit.</p>
<p class="MsoNormal">All entries and subsequent material will be made available to a wide community under the terms of the Creative Commons license.</p>
<p class="MsoNormal">Intel reserves the right to use the entries in any fashion, including the right to publicize entries and winners in promotional material. Intel reserves the right to substitute prizes of equal value. Decisions of the judges are final.</p>
 ]]></description>
      <link>http://software.intel.com/en-us/articles/sigcse-parallelism-lightning-rounds/</link>
      <pubDate>Mon, 06 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/sigcse-parallelism-lightning-rounds/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/sigcse-parallelism-lightning-rounds/</guid>
      <category>ISN General</category>
      <category>Academic</category>
    </item>
    <item>
      <title>Intel® VTune™ Amplifier XE 2011 for Windows Evaluation</title>
      <description><![CDATA[ <!-- CMD code start -->
<link href="http://software.intel.com/sites/products/web2010/css/custom-theme/jquery-ui-1.8.4.custom.css" type="text/css" rel="stylesheet" />
<link href="http://software.intel.com/sites/products/web2010/css/dpdstyle.css" type="text/css" rel="stylesheet" />
<link href="http://software.intel.com/sites/products/web2010/shadowbox-3.0.3/shadowbox.css" type="text/css" rel="stylesheet" />
<link media="screen, projection" href="http://software.intel.com/sites/products/web2010/css/ie.css" type="text/css" rel="stylesheet" />






<!-- CMD code end -->

<div id="wrap"><!-- Navigation --> 
<!--Top Header -->
<div id="product_component_header">
<table>
<tbody>
<tr>
<td><img height="107" width="13" src="http://software.intel.com/sites/products/web2010/images/transparent.gif" /></td>
<td><a name="pageheader" id="pageheader"></a><span ><b>Intel® VTune™ Amplifier XE 2011 for Windows* - Evaluation</b></span><br /><br /></td>
</tr>
</tbody>
</table>
</div>
<!--End Top Header --><!-- Content Container -->
<div id="tabbox">
<div align="top" class="notab-box-shadow" id="contentwell">
<div >
<table>
<tbody>
<tr>
<td>
<p>To install and use the Intel® VTune™ Amplifier XE 2011 for Windows* product, you must already have a supported Microsoft development product installed. Please see the <a target="_blank" href="http://software.intel.com/en-us/articles/intel-c-composer-xe-2011-release-notes/">System Requirements</a> for more information. If you do not have a supported Microsoft development product installed, you may download a <a target="_blank" href="http://www.microsoft.com/downloads/en/details.aspx?FamilyID=26bae65f-b0df-4081-ae6e-1d828993d4d0&amp;displaylang=en">free 90-day trial version of Microsoft Visual Studio 2010*</a> from Microsoft. This will allow for full functionality of the product during your evaluation. If, at the end of the evaluation, you choose to purchase Intel® VTune™ Amplifier XE 2011 for Windows*, you must also purchase a license for a supported Microsoft development product if you do not already have one.</p>
<p>If you understand and accept this requirement, click the Accept button to proceed with your evaluation.</p>
<table align="center" width="300" cellpadding="0" cellspacing="0" border="0" class="sectionBodyText">
<tbody>
<tr>
<td><a href="https://registrationcenter.intel.com/RegCenter/AutoGen.aspx?ProductID=1503&amp;AccountID=&amp;EmailID=&amp;ProgramID=&amp;RequestDt=&amp;rm=EVAL&amp;lang="><img src="http://software.intel.com/file/27597" /></a></td>
<td width="10"></td>
<td><a href="http://software.intel.com/en-us/articles/intel-software-evaluation-center/"><img src="http://software.intel.com/file/27598" /> </a></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-vtune-amplifier-xe-2011-for-windows-evaluation/</link>
      <pubDate>Fri, 03 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-vtune-amplifier-xe-2011-for-windows-evaluation/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-vtune-amplifier-xe-2011-for-windows-evaluation/</guid>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>Intel® Inspector XE 2011 for Windows Evaluation</title>
      <description><![CDATA[ <!-- CMD code start -->
<link href="http://software.intel.com/sites/products/web2010/css/custom-theme/jquery-ui-1.8.4.custom.css" type="text/css" rel="stylesheet" />
<link href="http://software.intel.com/sites/products/web2010/css/dpdstyle.css" type="text/css" rel="stylesheet" />
<link href="http://software.intel.com/sites/products/web2010/shadowbox-3.0.3/shadowbox.css" type="text/css" rel="stylesheet" />
<link media="screen, projection" href="http://software.intel.com/sites/products/web2010/css/ie.css" type="text/css" rel="stylesheet" />






<!-- CMD code end -->

<div id="wrap"><!-- Navigation --> 
<!--Top Header -->
<div id="product_component_header">
<table>
<tbody>
<tr>
<td><img height="107" width="13" src="http://software.intel.com/sites/products/web2010/images/transparent.gif" /></td>
<td><a name="pageheader" id="pageheader"></a><span ><b>Intel® Inspector XE 2011 for Windows* - Evaluation</b></span><br /><br /></td>
</tr>
</tbody>
</table>
</div>
<!--End Top Header --><!-- Content Container -->
<div id="tabbox">
<div align="top" class="notab-box-shadow" id="contentwell">
<div >
<table>
<tbody>
<tr>
<td>
<p>To install and use the Intel® Inspector XE 2011 for Windows* product, you must already have a supported Microsoft development product installed. Please see the <a target="_blank" href="http://software.intel.com/en-us/articles/intel-c-composer-xe-2011-release-notes/">System Requirements</a> for more information. If you do not have a supported Microsoft development product installed, you may download a <a target="_blank" href="http://www.microsoft.com/downloads/en/details.aspx?FamilyID=26bae65f-b0df-4081-ae6e-1d828993d4d0&amp;displaylang=en">free 90-day trial version of Microsoft Visual Studio 2010*</a> from Microsoft. This will allow for full functionality of the product during your evaluation. If, at the end of the evaluation, you choose to purchase Intel® Inspector XE 2011 for Windows*, you must also purchase a license for a supported Microsoft development product if you do not already have one.</p>
<p>If you understand and accept this requirement, click the Accept button to proceed with your evaluation.</p>
<table align="center" width="300" cellpadding="0" cellspacing="0" border="0" class="sectionBodyText">
<tbody>
<tr>
<td><a href="https://registrationcenter.intel.com/RegCenter/Evalform.aspx?productid=1511"><img src="http://software.intel.com/file/27597" /></a></td>
<td width="10"></td>
<td><a href="http://software.intel.com/en-us/articles/intel-software-evaluation-center/"><img src="http://software.intel.com/file/27598" /> </a></td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
</div>
</div>
</div>
</div> ]]></description>
      <link>http://software.intel.com/en-us/articles/intel-inspector-xe-2011-for-windows-evaluation/</link>
      <pubDate>Fri, 03 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/intel-inspector-xe-2011-for-windows-evaluation/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/intel-inspector-xe-2011-for-windows-evaluation/</guid>
      <category>Intel Software Network communities</category>
    </item>
    <item>
      <title>OpenMP loop does not parallelize with continue statement in the catch block</title>
      <description><![CDATA[ A "continue" statement inside the "catch" block of a try/catch construct, within a for loop in an OpenMP parallel region, prevents the compiler from parallelizing the OpenMP loop, as shown below:<br /><br />
<p>#include &lt;iostream&gt;<br />#include "omp.h"<br />using namespace std;<br />const int MAX = 6000;</p>
<p>int main(){</p>
<p>#pragma omp parallel for  <br />    for(int i=0; i&lt; MAX; i++){<br />      try{<br />          cerr &lt;&lt; "testing ... " &lt;&lt;endl;<br />      }   catch (exception&amp; e){<br />            #ifdef CONTINUE<br />              continue;<br />            #endif<br />          }<br />    }<br />  return 0;<br />}</p>
<p>// Loop parallelizes without the "continue" statement<br /><strong>&gt;icl -c -Qopenmp -Qopenmp-report2 /EHsc test.cpp<br /></strong>Intel(R) C++ Compiler XE for applications running on IA-32, Version 12.1.2.278 Build 20111128<br />Copyright (C) 1985-2011 Intel Corporation.  All rights reserved.</p>
<p>test.cpp<br />test.cpp(9): (col. 1) remark: OpenMP DEFINED LOOP WAS PARALLELIZED.</p>
<p>// Loop does not parallelize with the "continue" statement <br /><strong>&gt;icl -c /D CONTINUE -Qopenmp -Qopenmp-report2 /EHsc test.cpp<br /></strong>Intel(R) C++ Compiler XE for applications running on IA-32, Version 12.1.2.278 Build 20111128<br />Copyright (C) 1985-2011 Intel Corporation.  All rights reserved.</p>
<p>test.cpp</p>
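<p>One possible way to sidestep the limitation, sketched here as an assumption rather than a verified fix, is to keep "continue" out of the "catch" block: the handler only sets a flag, and the rest of the loop body is guarded by that flag. The names process and sum_processed are hypothetical and are not part of the original test case, and this form has not been checked against the compiler version shown above.</p>

```cpp
#include <exception>
#include <stdexcept>
#include <vector>

// Hypothetical per-iteration work (not from the original test case):
// throws for negative input, otherwise doubles the value.
static int process(int value) {
    if (value < 0) throw std::invalid_argument("negative input");
    return value * 2;
}

// Sums process(v) over all inputs, skipping the elements that throw.
// The catch block only records the failure; no `continue` is used.
long sum_processed(const std::vector<int>& input) {
    long total = 0;
    const int n = static_cast<int>(input.size());
#pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++) {
        bool failed = false;
        int result = 0;
        try {
            result = process(input[i]);
        } catch (const std::exception&) {
            failed = true;      // stands in for the `continue` statement
        }
        if (!failed) {
            total += result;    // work that `continue` would have skipped
        }
    }
    return total;
}
```

<p>In this sketch, calling sum_processed on a vector holding 1, -2, and 3 returns 8, because the negative element throws and is skipped. Compiling with -Qopenmp -Qopenmp-report2 as above would show whether the loop is reported as parallelized.</p>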
<p>This is a known issue that is under investigation and may be resolved in a future version of the compiler.</p> ]]></description>
      <link>http://software.intel.com/en-us/articles/openmp-loop-does-not-parallelize-with-continue-statement-in-the-catch-block/</link>
      <pubDate>Fri, 03 Feb 2012 00:00:00 -0800</pubDate>
      <comments>http://software.intel.com/en-us/articles/openmp-loop-does-not-parallelize-with-continue-statement-in-the-catch-block/#comments</comments>
      <guid isPermaLink="true">http://software.intel.com/en-us/articles/openmp-loop-does-not-parallelize-with-continue-statement-in-the-catch-block/</guid>
      <category>Intel® C++ Compiler for Linux* Knowledge Base</category>
      <category>Intel® C++ Compiler for Mac OS X* Knowledge Base</category>
      <category>Intel® C++ Compiler for Windows* Knowledge Base</category>
      <category>Intel® Parallel Composer Knowledge Base</category>
    </item>
  </channel></rss>
